Introduction

The  development  of  breast  cancer,  including  late  stage  events  such  as  metastasis  and 
drug  resistance,  requires  mutations.  The  origins  of  most  of  these  mutations  are 
unknown. We recently implicated the DNA cytosine deaminase APOBEC3B. This Idea
Award tests two hypotheses: that APOBEC3B causes a genome-wide hypermutable
state through cytosine deamination, and that APOBEC3B alters the epigenome through
methyl-cytosine deamination. Positive results will be
significant  because  they  will  delineate  a  major  source  of  mutations  and  epigenetic 
changes  in  breast  cancer,  and  thereby  pave  the  way  for  new  diagnostic/prognostic  tests 
and  methods  to  treat  breast  cancer  by  preventing  the  activity  of  this  enzyme. 
Keywords

APOBEC3B;  Apolipoprotein  B  mRNA  editing  enzyme,  catalytic  polypeptide-like-3  B; 
sometimes  abbreviated  A3B;  one  of  7  human  A3  family  members 

C;  Cytosine  (a  DNA  and  RNA  base) 

DNA;  Deoxyribonucleic  acid 

ER;  estrogen  receptor  (molecular  target  of  the  breast  cancer  therapeutic  tamoxifen) 

G;  Guanine  (a  DNA  and  RNA  base) 

MeC; 5-methyl-cytosine (a common epigenetic modification in human DNA)

qPCR; Quantitative polymerase chain reaction

shRNA;  short  hairpin  RNA  (a  molecular  tool  used  to  decrease  gene  expression) 

SOW;  Statement  of  Work 

T;  Thymine  (a  base  typically  found  in  DNA,  but  also  the  product  of  APOBEC3B-catalyzed 
MeC  deamination) 

U;  Uracil  (a  base  typically  found  in  RNA  but  also  the  product  of  APOBEC3B-catalyzed  C 
deamination) 
Overall Project Summary (significant revisions and/or additions to the original
final  report  are  highlighted  in  yellow) 


This  section  provides  a  final  report  and  a  narrative  of  progress  over  the  2  year 
duration  of  this  Idea  award.  Please  see  Table  1  below  for  an  updated  SOW  including 
final  reports  of  the  status  of  each  task.  A  summary  and  discussion  (as  requested)  of  the 
progress  on  each  aim  follows. 


Aim  1  -  Does  A3B  cause  a  genome-wide  hypermutable  state? 

Aim 1 rationale: Although we have demonstrated APOBEC3B up-regulation in
tumors and APOBEC3B activity in the nuclear extracts of several breast cancer cell
lines [1], we still need to overcome the highest hurdle and demonstrate that APOBEC3B
actually alters the genetic landscape of a breast cancer cell. This will be done by deep
sequencing to document the APOBEC3B-dependent contribution to the overall mutation
distribution in cell lines and by performing a series of experiments with a well-established
xenograft tumor model.


Aim  1  -  Summary  of  Results,  Progress  and  Accomplishments  with  Discussion. 


Aim  1A  -  deep-sequencing  cell  lines:  We  have  now  deep  sequenced  several 
different  cancer  cell  lines,  and  have  encountered  significant  genetic  heterogeneity  in 
most  instances  that  precluded  analyses  of  APOBEC3B  mutations.  However,  we  have 
succeeded in one system in which APOBEC3B can be expressed inducibly. These results
are detailed in Appendix A, an open access publication by Akre et al., 2016, PLoS
One (PMID: 27163364 PMCID: PMC4862684) and discussed here.


Figure  1  shows  doxycycline-induced  expression  of  APOBEC3B.  Figure  2  shows  a 
titration  of  doxycycline  levels  that  induce  APOBEC3B  expression  and  result  in 
approximately 90% cell death. This level of doxycycline was used to induce 10 rounds of
APOBEC3B  expression  and  mutagenesis  in  daughter  pools.  Representative  cells  were 
then  outgrown  from  each  pool  (single  cell  cloned)  and  subjected  to  microarray  analysis 
for  single  nucleotide  polymorphisms  (SNPs)  and  full  genome  DNA  sequencing.  Figure  3 
shows  the  results  of  the  microarray  analysis  with  increased  numbers  of  SNPs  and 
increased  levels  of  copy  number  variations  (CNVs).  Figure  4  shows  the  results  of  the  full 
genome  sequence  analysis.  As  anticipated,  APOBEC3B  mutations  were  detected 
throughout  the  genome  at  elevated  frequencies.  However,  unexpectedly,  we  discovered 
that  this  cell  line  is  defective  in  mismatch  repair  and  had  very  high  background  levels  of 
mutation,  which  precluded  more  extensive  analyses  of  the  APOBEC3B  mutational 
landscape.  Nevertheless,  this  series  of  experiments  demonstrated  the  genome-wide 
impact  of  APOBEC3B  and  provided  several  valuable  lessons  to  apply  in  future  studies. 


Aim 1B - xenograft experiments in mice: The proposed xenograft studies took
longer than expected, in part because key experiments were repeated and an
over-expression study was added. However, we are delighted to report that the results are positive,
and that therapy (tamoxifen) resistance in the ER+ breast cancer cell line MCF-7L is
dependent upon APOBEC3B. Specifically, APOBEC3B knockdown slows the rate of
tumor evolution and drug resistance, and APOBEC3B over-expression speeds up tumor
evolution and drug resistance. These analyses, including extensive methodologies, are
detailed in Appendix B, an open access publication by Law et al., 2016, Science
Advances (PMID: 27730215 PMCID: PMC5055383) and discussed here.

Figure  1  reports  clinical  data  from  our  Dutch  collaborators.  A  significant  correlation 
is  evident  between  APOBEC3B  mRNA  levels  in  primary  tumors  and  progression  free 
survival  upon  disease  recurrence.  Essentially,  the  higher  the  APOBEC3B  levels  in  the 
original  tumor,  the  poorer  the  outcomes  in  the  recurrent  setting  for  ER+  disease 
subjected to tamoxifen monotherapy. Figure 2 shows that shRNA-mediated knockdown
of APOBEC3B in the ER+ breast cancer cell line MCF-7L is robust and, importantly, that
it does not alter cellular growth rates in culture. Figure 3 is a representative xenograft
experiment in which APOBEC3B knockdown improves the durability of tamoxifen
treatment by reducing the rate of developing drug resistance. Figures 4 and 5 show the
results of APOBEC3B overexpression using a novel lentivirus-based construct (schematic
in Figure 4A). Importantly, overexpression of the wildtype APOBEC3B enzyme, but not a
catalytically dead form, reduces the durability of tamoxifen treatment by accelerating
the rate of developing drug resistance. Taken together, these results are the first to
demonstrate that altering the cellular levels of a single enzyme, APOBEC3B, can
systematically influence the rate of acquired resistance to tamoxifen therapy.

Although  this  study  was  successful,  it  also  faced  some  technical  challenges.  For 
instance,  the  MCF-7L  cell  line  is  genetically  heterogeneous,  which  precluded  the 
identification of the resistance mutations by exome sequencing. However, we have
learned  from  these  challenges  and  have  taken  a  number  of  precautions,  including  the 
utilization  of  pre-defined  clonogenic  breast  cancer  cell  lines,  that  we  are  confident  will 
enable  future  successes. 


Aim  2  -  Does  A3B  impact  genomic  MeC  levels? 

Aim  2  rationale:  The  impetus  for  this  aim  stems  from  observations  that  the  related 
DNA deaminases AID and APOBEC3A elicit MeC-to-T editing activity in vitro [2-4], and
AID has been implicated in altering the MeC status of mouse germ and stem cells [5, 6].
Since AID is not expressed in normal breast epithelium or breast tumor cells and only
A3B is up-regulated in breast tumors [1], we hypothesize that A3B alone has the
capacity  to  remodel  the  breast  cancer  MeC  landscape.  This  hypothesis  will  be  tested 
here  in  experiments  that  are  complementary  to  those  described  above. 


Aim 2 - Summary of Results, Progress and Accomplishments with Discussion.

Aims 2A-C: We have completed the original studies as proposed and have found
that APOBEC3B is not likely to have a role in genomic DNA demethylation (although it
can do so biochemically). In essence, bisulfite sequencing has not identified any sites in
the genome that become hypomethylated in APOBEC3B over-expressing cell lines in
comparison to non-APOBEC3B expressing controls. We are concerned that the
developmental fate of cell lines is difficult to alter, and that future studies in mice in vivo
may be more informative.
Table 1. Progress on original SOW with current status/progress highlighted in blue.

Aim 1: Does APOBEC3B cause a genome-wide hypermutable state?

Task: Engineering breast cancer cell lines MDA-MB-231, MDA-MB-453, MDA-MB-468, and HCC1569 to knock-down endogenous A3B and generate control lines; generate multiple sub-clones for each line.
Methods employed: Molecular biology, cell culture, qRT-PCR
Timeline and Status: Months 1-6; completed as proposed

Task: Preparation of genomic DNA from selected cell lines (likely HCC1569) prepared in the above tasks to express high or low levels of A3B. Delivery of DNA to sequencing facility for whole exome capture, deep sequencing, and data/sequence analysis.
Methods employed: General molecular biology techniques, data/sequence analysis, bioinformatics
Timeline and Status: Months 6-18; sequencing done but results ambiguous because most cell lines were heterogeneous; we have had success with one cell line and the results were published in PLoS One (Appendix A - Akre et al., 2016).

Task: Completion of IACUC forms for approval of animal experiments (80 NCr nude mice are proposed for the full xenograft experiment with numbers determined by power analysis - details can be found in the main text of the proposal). Once approved, the engineered cell lines described above (and in the narrative) will begin being xenografted into mice and therapies administered.
Methods employed: Cell culture, mouse model techniques
Timeline and Status: Months 1-5 for IACUC review, months 6-18 for animal procurement and xenograft experiments; IACUC approval was received, the cell lines were engineered, and the xenograft experiments were done

Task: Tumor collection and analysis from xenografts.
Methods employed: Mouse model techniques, cancer-molecular biology techniques, qRT-PCR, sequence analysis
Timeline and Status: Months 16-20; done but DNA sequencing results were ambiguous because the cell line was heterogeneous

Task: Prepare data for publication. Publish manuscript.
Methods employed: Data analysis and writing
Timeline and Status: Months 20-24; a manuscript has been published in Science Advances (Appendix B - Law et al., 2016).

Aim 2: Does APOBEC3B impact the genomic methyl-cytosine landscape?

Task: Engineering of cell lines MDA-MB-231, MDA-MB-453, MDA-MB-468, and HCC1569 to knock-down endogenous A3B. Passage of lines from generations 2-32, with collection of DNA at generations 2, 4, 8, 16, and 32. Assessment of MeC levels using MeC ELISA kit.
Methods: Cell culture, molecular biology techniques, western blotting, qRT-PCR, ELISA
Timeframe: Months 1-6; completed as proposed.

Task: In parallel with the task immediately above, the same DNA samples will be assessed for MeC content using HPLC-MS/MS, rather than ELISA.
Methods: Cell culture, molecular biology techniques, western blotting, qRT-PCR, HPLC-MS/MS
Timeframe: Months 2-7; completed as proposed.

Task: Again, the same DNA samples as in the previous 2 tasks will be subjected to bisulfite sequencing to assess DNA methylation status in regions of the genome that are known to be affected by hypomethylation (see narrative for further details).
Methods: Cell culture, molecular biology techniques, deep sequencing, western blotting, qRT-PCR, bisulfite sequencing
Timeframe: Months 3-12; completed but the bisulfite DNA sequencing results were ambiguous because the cell line was heterogeneous

Task: We will engineer the non-tumorigenic cell lines MCF-10A (previously acquired from ATCC) and hTERT-HMEC (a gift from the lab of Dr. Vitaly Polunovsky) to over-express A3B by transfection with a linearized, tagged A3B-expression cassette followed by selection of stable clones. Control lines will be generated using the catalytically dead, tagged A3B-E255Q.
Methods: Cell culture, molecular biology techniques, western blotting, qRT-PCR
Timeframe: Months 3-12; completed as proposed.

Task: Assessment of A3B over-expressing engineered cell lines’ ability to alter the levels of MeC in the cell genome (determined by ELISA, HPLC-MS/MS, and bisulfite sequencing).
Methods: Cell culture, cancer-molecular biology techniques, ELISA, HPLC-MS/MS, bisulfite sequencing
Timeframe: Months 8-16; completed as proposed.

Task: Bisulfite-coupled deep sequencing will be performed to quantify the levels of demethylation and identify any demethylation hot-spots and mutational spectra as a function of A3B expression. Samples sent for sequencing will be pairs of A3B high/A3B knock-down and A3B over-expressed/A3B-E255Q over-expressed DNA determined empirically from the previous aims to have positive results by ELISA, HPLC-MS/MS, and local bisulfite sequencing.
Methods: Bisulfite-coupled deep sequencing
Timeframe: Months 15-22; completed but the bisulfite DNA sequencing results were ambiguous because the cell line was heterogeneous

Task: Analysis and compilation of data. Assembly of manuscript.
Methods: Data analysis and writing
Timeframe: Months 20-24; we have invested almost all effort in the success of Aim 1 once we learned Aim 2 would test negative.
Key Research Accomplishments


1) Cell lines have been constructed that inducibly express APOBEC3B, and demonstrate
the genome-wide nature of this breast cancer mutagenesis mechanism [Appendix A:
Akre et al., 2016, PLoS One (PMID: 27163364 PMCID: PMC4862684)].

2) Xenograft experiments with the ER+ breast cancer cell line MCF-7L have demonstrated
that APOBEC3B is a significant driver of tumor evolution and resistance to the
SERM tamoxifen [Appendix B: Law et al., 2016, Science Advances (PMID:
27730215 PMCID: PMC5055383)].
Conclusion


We  are  thrilled  to  report  that  our  xenograft  studies  have  been  successful,  and  allowed  us 
to  demonstrate  that  APOBEC3B  drives  tamoxifen  resistance  in  an  ER+  breast  cancer  cell 
line  (Law  et  al.,  2016,  Science  Advances).  Due  to  the  fundamental  nature  of  the 
underlying  mutational  process  and  the  breadth  of  APOBEC3B  over-expression  in  breast 
and  other  cancer  types,  this  result  is  likely  to  be  broadly  applicable.  The  next  step  will  be 
developing  strategies  to  stop  APOBEC3B  driven  breast  tumor  evolution  in  the  hope  of 
improving  the  efficacy  of  existing  therapies  such  as  tamoxifen,  which  can  be  undermined 
by  tumor  evolution  and  the  acquisition  of  resistance  mutations. 
Abstract

Molecular,  cellular,  and  clinical  studies  have  combined  to  demonstrate  a  contribution  from 
the  DNA  cytosine  deaminase  APOBEC3B  (A3B)  to  the  overall  mutation  load  in  breast, 
head/neck,  lung,  bladder,  cervical,  ovarian,  and  other  cancer  types.  However,  the  complete 
landscape  of  mutations  attributable  to  this  enzyme  has  yet  to  be  determined  in  a  controlled 
human  cell  system.  We  report  a  conditional  and  isogenic  system  for  A3B  induction,  genomic 
DNA  deamination,  and  mutagenesis.  Human  293-derived  cells  were  engineered  to  express 
doxycycline-inducible A3B-eGFP or eGFP constructs. Cells were subjected to 10 rounds of
A3B-eGFP  exposure  that  each  caused  80-90%  cell  death.  Control  pools  were  subjected  to 
parallel  rounds  of  non-toxic  eGFP  exposure,  and  dilutions  were  done  each  round  to  mimic 
A3B-eGFP  induced  population  fluctuations.  Targeted  sequencing  of  portions  of  TP53  and 
MYC  demonstrated  greater  mutation  accumulation  in  the  A3B-eGFP  exposed  pools. 

Clones were generated and microarray analyses were used to identify those with the greatest
number of SNP alterations for whole genome sequencing. A3B-eGFP exposed clones
showed  global  increases  in  C-to-T  transition  mutations,  enrichments  for  cytosine  mutations 
within  A3B-preferred  trinucleotide  motifs,  and  more  copy  number  aberrations.  Surprisingly, 
both  control  and  A3B-eGFP  clones  also  elicited  strong  mutator  phenotypes  characteristic  of 
defective mismatch repair. Despite this additional mutational process, the 293-based system
characterized here still yielded a genome-wide view of A3B-catalyzed mutagenesis in
human  cells  and  a  system  for  additional  studies  on  the  compounded  effects  of  simultaneous 
mutation  mechanisms  in  cancer  cells. 


APOBEC3B Mutation Signature in Human Cells

Introduction

Cancer genome sequencing studies have defined approximately 30 distinct mutation signatures
(reviewed by [1-4]). Some signatures are large-scale confirmations of established sources of
DNA damage that escaped repair or were repaired incorrectly. The largest is water-mediated
deamination of methyl-cytosine bases, which manifest as C-to-T transitions in genomic 5’-CG
motifs [5]. This process impacts almost all cancer types and accumulates as a function of age.
Other well known examples include ultraviolet radiation, UV-A and UV-B, which crosslink
adjacent pyrimidine bases and result in signature C-to-T transitions [6], and tobacco mutagens
such as nitrosamine ketone (NNK), which metabolize into reactive forms that covalently bind
guanine bases and result in signature G-to-T transversions [7]. These latter mutagenic processes
are well known drivers of skin cancer and lung cancer, respectively, but also contribute
to other tumor types. A lesser-known but still significant example of a mutagen is the dietary
supplement aristolochic acid, which is derived from wild ginger and related plants and metabolized
into reactive species that covalently bind adenine bases and cause A-to-T transversions
[8, 9]. Aristolochic acid mutation signatures are evident in urothelial cell, hepatocellular, and
bladder carcinomas. Other confirmed mutation sources include genetic defects in recombination
repair (BRCA1, BRCA2, etc.), post-replication mismatch repair (MSH2, MLH1, etc.), and
DNA replication proofreading function, which manifest as microhomology-mediated insertion/deletion
mutations, repeat/microsatellite slippage mutations, and transversion mutation
signatures, respectively [4, 5, 10].

The largest previously undefined mutation signature in cancer is C-to-T transitions and C-to-G
transversions within 5’-TC dinucleotide motifs [5, 11, 12]. This mutation signature occurs
throughout the genome, as well as less frequently in dense clusters called kataegis. This signature
is ascribable to the enzymatic activity of members of the APOBEC family of DNA cytosine
to uracil deaminases [5, 11-15]. Human cells encode up to 9 distinct APOBEC family members
with demonstrated C-to-U editing activity, and 7 of these 9 have been shown to prefer 5’-TC dinucleotide
motifs in single-stranded DNA substrates: APOBEC1, APOBEC3A, APOBEC3B (A3B),
APOBEC3C, APOBEC3D, APOBEC3F, and APOBEC3H. In contrast, AID and APOBEC3G
prefer 5’-RC and 5’-CC, respectively (R = purine; reviewed by [16, 17]). The size and similarity
of this protein family, as well as the formal possibility that another DNA damage source may
be responsible for the same mutation signature [18], have made DNA sequencing data and
informatics analyses open to multiple interpretations.
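
The dinucleotide preferences described above can be illustrated with a short sketch that reports the 5’ neighbor of a mutated cytosine on whichever strand carries the C. This is only an illustration under stated assumptions: the function names (`reverse_complement`, `cytosine_context`, `motif_class`) are hypothetical and are not taken from the analyses reported here, and the input is assumed to be a plain reference-strand DNA string with a 0-based position.

```python
# Illustrative sketch only; names are hypothetical and not from the study.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def cytosine_context(sequence: str, position: int):
    """Return the 5'-NC dinucleotide for the cytosine at `position`.

    If the reference strand shows G, the cytosine sits on the opposite
    strand, so the context is read from the reverse complement.
    Returns None when the base is not C/G or has no 5' neighbor.
    """
    seq = sequence.upper()
    base = seq[position]
    if base == "C" and position > 0:
        return seq[position - 1] + "C"
    if base == "G" and position < len(seq) - 1:
        return reverse_complement(seq[position:position + 2])
    return None


def motif_class(context):
    """Map a dinucleotide context to the motif classes named in the text."""
    if context == "TC":
        return "TC"  # preferred by APOBEC1, A3A, A3B, A3C, A3D, A3F, A3H
    if context == "CC":
        return "CC"  # preferred by APOBEC3G
    if context in ("AC", "GC"):
        return "RC"  # R = purine; preferred by AID
    return None


if __name__ == "__main__":
    print(motif_class(cytosine_context("ATCGA", 2)))  # C on the reference strand
    print(motif_class(cytosine_context("TTGAA", 2)))  # C on the opposite strand
```

Reading the context from the strand that carries the cytosine is what allows G-to-A and G-to-C changes on the reference strand to be counted together with C-to-T and C-to-G changes.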

However, independent [13, 19] and subsequent [14, 15, 20-26] studies indicate that at least
one DNA deaminase family member, A3B, has a significant role in causing these types of mutations
in cancer. A3B localizes to the nucleus throughout the cell cycle except during mitosis
when it appears excluded from chromatin [19]. A3B is upregulated in breast cancer cell lines
and primary tumors at the mRNA, protein, and activity levels [13, 20, 27]. Endogenous A3B is
the only detectable deaminase activity in nuclear extracts of many cancer cell lines representing
a broad spectrum of cancer types (breast, head/neck, lung, ovarian, cervix, and bladder [13, 20,
27]). Endogenous A3B is required for elevated levels of steady state uracil and mutation frequencies
in breast cancer cell lines [13]. Overexpressed A3B induces a potent DNA damage
response characterized by gamma-H2AX and 53BP1 accumulation, multinuclear cell formation,
and cell cycle deregulation [13, 21, 22]. A3B levels correlate with overall mutation loads in
breast and head/neck tumors [13, 23]. The biochemical deamination preference of recombinant
A3B, 5’-TCR, is similar to the actual cytosine mutation pattern observed in breast, head/neck,
lung, cervical, and bladder cancers [13, 14, 20]. Human papillomavirus (HPV) infection
induces A3B expression in several human cell types, providing a link between viral infection
and the observed strong APOBEC mutation signatures in cervical and some head/neck and
bladder cancers [28-30]. The spectrum of oncogenic mutations in PIK3CA is biased toward
signature A3B mutation targets in HPV-positive head/neck cancers [23]. Last but not least,
high A3B levels correlate with poor outcomes for estrogen receptor-positive breast cancer
patients [25, 26, 31].


PLOS ONE | DOI:10.1371/journal.pone.0155391 May 10, 2016




Despite this extensive and rapidly growing volume of genomic, molecular, and clinical
information on A3B in cancer, the association between A3B and APOBEC mutational signatures
has so far only been correlative, and a mechanistic demonstration of this enzyme’s activity
on the human genome has yet to be determined. Here we report further development of a
human 293 cell-based system for conditional expression of human A3B. The results reveal, for
the first time in a human cell line, the genomic landscape of A3B induced mutagenesis.

Materials  and  Methods 

Cell  Lines 

We previously reported T-REx-293 cells that conditionally express A3B [13]. However, the
mother,  daughter,  and  granddaughter  lines  described  here  are  new  in  order  to  ensure  a  single 
cell  origin  and  have  all  of  the  controls  derived  in  parallel.  T-REx-293  cells  were  cultured  in 
high  glucose  DMEM  (Hyclone)  supplemented  with  10%  FBS  and  0.5%  Pen/Strep.  Single  cell 
derived  mother  lines,  A  and  C,  were  obtained  by  limiting  dilution  in  normal  growth  medium. 
These mother clones were transfected with linearized pcDNA5/TO-A3Bintron-eGFP (A3Bi-eGFP)
or pcDNA5/TO-eGFP vectors [13, 32], selected with 200 µg/mL hygromycin, and
screened  as  described  in  the  main  text  to  identify  drug-resistant  daughter  clones  capable  of 
Dox-mediated  induction  of  A3Bi-eGFP  or  eGFP,  respectively.  The  encoded  A3B  enzyme  is 
identical  to  “isoform  a”  in  GenBank  (NP_004891.4).  GFP  flow  cytometry  was  done  using  a 
FACSCanto  II  instrument  (BD  Biosciences). 

Immunoblots 

Whole cell lysates were prepared by suspending 1x10^6 cells in 300 µL 10x reducing sample buffer
(125 mM Tris pH 6.8, 40% glycerol, 4% SDS, 5% 2-mercaptoethanol, and 0.05% bromophenol
blue). Soluble proteins were fractionated by 4% stacking and 12% resolving SDS-PAGE and transferred
to PVDF membranes using a wet transfer BioRad apparatus. Membranes were blocked for
1 hr in 4% milk in PBS with 0.05% sodium azide. Primary antibody incubations, anti-GFP
(JL8-BD Clontech) and anti-β-actin (Cell Signaling), were done at a 1:1000 dilution in 4% milk
diluted in PBST, and incubation conditions ranged from 4-8 degrees C for 2-16 hrs. Membranes
were then washed 3 times for 5 minutes in PBST. Secondary antibody incubations, anti-mouse 680
(1:20000) and anti-rabbit 800 (1:20000), were done in 4% milk diluted in PBST with 0.01% SDS,
and incubation conditions ranged from 4-8 degrees C for 2-16 hrs. The resulting membranes
were washed 3 times for 5 minutes in PBST and imaged using Licor instrumentation (Odyssey).

DNA  Deaminase  Activity  Assays 

This assay was adapted from published procedures [27, 33]. Whole-cell extracts were prepared
from 1x10s cells by sonication in 200 µL HED buffer (25 mM HEPES, 5 mM EDTA, 10% glycerol,
1 mM DTT, and one protease inhibitor tablet (Roche) per 50 mL HED buffer). Debris was
removed by a 30 min maximum speed spin in a tabletop micro-centrifuge at 4 degrees C. The
supernatant was then used in 20 µL deamination reactions that contained the following: 1 µL of
4 µM fluorescently-labeled 43-mer oligo
(5’-ATTATTATTATTCGAATGGATTTATTTATTTATTTATTTATTT-fluorescein)
containing a single interior 5’-TC substrate, 0.25 µL UDG
(NEB), 0.25 µL RNase, 2 µL 10x UDG buffer (NEB), and 16.5 µL lysate. Reactions were incubated at
37 degrees C for 1 h. Then 2 µL of 1 M NaOH was added and the reaction was heated to 95 degrees C in a
thermocycler for 10 min. 22 µL of 2x formamide loading buffer was added to each sample. 5 µL
of each reaction was fractionated on a 15% TBE Urea Gel and imaged using a SynergyMx plate
reader (BioTek).
Differential DNA Denaturation (3D) PCR Experiments

This assay was adapted from published procedures [13, 34]. Genomic DNA was extracted from
samples using a PureGene protocol (Gentra) and quantified using Nanodrop instrumentation
(ThermoFisher Scientific). 20 ng of genomic DNA was subjected to one round of normal high
denaturation temperature PCR using Taq Polymerase (Denville) and primers for MYC
(5’-ACGTTAGCTTCACCAACAGG and 3’TTCATCAAAAACATCATCATCCAG) or TP53
(5’GAGCTGGAGCTTAGGCTCCAGAAAGGACAA and 3’TTCCTAGCACTGCCCAACAACACCAGC).
383 bp and 376 bp PCR products were purified and quantified using qPCR with
nested primer sets and SYBR Green detection (Roche 480 LightCycler;
5’ACGAGGAGGAGAACTTCTACCAGCA and 3’TTCATCTGCGACCCGGACGACGAGA for MYC and
5’TTCTCTTTTCCTATCCTGAGTAGTGGTAA and 3’TTATGCCTCAGATTCACTTTTATCACCTTT
for TP53). Equivalent amounts of each PCR product were then used for 3D-PCR using
the same nested PCR primer sets. The resulting 291 and 235 bp products were fractionated by
agarose gel electrophoresis, purified using QIAEX II (Qiagen), cloned into a pJET vector (Fermentas),
and subjected to sequencing (GENEWIZ). Alignments and mutation calls were done
with Sequencher (Gene Codes Corporation).


SNP  Array  Based  Mutational  Analysis 

Granddaughter clones were established by limiting dilution after the final pulse round. Genomic
DNA was prepared from daughter and granddaughter clones using the Gentra PureGene
kit (Qiagen, Valencia, CA), quantified by agarose gel staining with ethidium bromide and by
NanoDrop measurements (Thermo Scientific, Wilmington, DE), and subjected to SNP array
analyses by Source BioScience (Cambridge, UK) using the Human OmniExpress-24v1-0 BeadChip
(Illumina, San Diego, CA). Raw data were pre-processed in GenomeStudio using the
Genotyping Module (Illumina, San Diego, CA). Genotype clustering was performed using the
humanomniexpress_24v1-0_a cluster file, whereby probes with a GenCall score below 0.15,
indicating low genotyping reliability, were discarded. All samples passed quality control as
assessed by call rates and frequencies. Genotypes for a total of 716,503 probes were used for
further analyses.

By comparing the genotypes of the granddaughter clones to the pre-pulsed daughter clones,
six classes of base substitutions could be determined (C-to-T, C-to-G, C-to-A, T-to-G, T-to-C,
and T-to-A). For example, a C-to-T transition occurred if the C/C genotype of the mother
clone changed to a C/T genotype in the granddaughter clone. Given the design of some microarray
probes (i.e., some probes detect the Watson-strand rather than the Crick-strand), a
change from a G/G in the mother clone to a G/A genotype in the granddaughter clone was also
scored as a C-to-T transition.
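
The genotype comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function name `classify_substitution`, the `"C/C"` genotype string format, and the restriction to homozygous starting genotypes are all assumptions made for the example.

```python
from collections import Counter

# Illustrative sketch only; names and input format are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def classify_substitution(reference_genotype: str, new_genotype: str):
    """Return one of the six pyrimidine-centered classes (e.g. "C>T") or None.

    Genotypes are strings such as "C/C" or "C/T". Only homozygous
    reference genotypes are scored in this sketch (an assumption).
    """
    ref_alleles = reference_genotype.split("/")
    new_alleles = new_genotype.split("/")
    if len(set(ref_alleles)) != 1:
        return None
    ref = ref_alleles[0]
    gained = {allele for allele in new_alleles if allele != ref}
    if len(gained) != 1:
        return None  # unchanged, or more than one new allele
    alt = gained.pop()
    # Probes that read the opposite strand report G or A as the reference
    # base; collapse them onto the pyrimidine so G/G -> G/A counts as C>T.
    if ref in ("G", "A"):
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


if __name__ == "__main__":
    pairs = [("C/C", "C/T"), ("G/G", "G/A"), ("A/A", "A/C"), ("T/T", "T/T")]
    tally = Counter(
        cls for cls in (classify_substitution(r, n) for r, n in pairs) if cls
    )
    print(dict(tally))
```

Collapsing purine reference bases onto their complements is what reduces the twelve possible base changes to the six classes listed in the text.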

Chromosomal abnormalities in the genomes of granddaughter clones were identified with
Nexus Copy Number 7.5 software (BioDiscovery, Hawthorne, CA), using the matched mother
clone as a reference. SNPRank segmentation was applied and the segmented copy number data
were further processed with the Tumor Aberrations Prediction Suite (TAPS) to obtain allele-specific
copy number profiles [35]. All analyses were performed using the R statistical environment
(http://www.R-project.org). The number of copy number alterations in the A3B-eGFP
pulsed clones was determined based on the difference between the segment copy number
counts of the A3B-eGFP pulsed clones and the eGFP pulsed clones. Segments for which the eGFP
pulsed granddaughter clones were not identical or had a CN of 0 were excluded. These were subsequently
binned by copy number loss or gain. All SNP data sets have been deposited in the
NCBI GEO database under accession code GSE78710.
Whole Genome Sequencing (WGS)

Library preparation and sequencing were performed by the Beijing Genome Institute (BGI) on
the Illumina X Ten platform to an average of 34.5 ± 2.8 fold coverage using purified DNA from
Pulse 10 subclone extractions described in the SNP array based methods. Sequences were
aligned to the hg19 reference genome using BWA. PCR duplicates were marked and removed
with Picard-tools (Broad). Somatic mutation calling was conducted using mpileup (SamTools),
VarScan2 (Washington University, MO), and MuTect (Broad Institute, MA). Mutations
detected by both VarScan2 and MuTect were kept as true somatic mutations. VarScan2 was
run using procedures described by de Bruin and coworkers [24]. MuTect was run using default
parameters. Alignments from CG1 and CG2 were used as “normal” controls for CA1 and CA3,
respectively. The alignment from AG3 was used as the “normal” control for AA3. CG1 and CG2
were used as normals for each other in order to determine their somatic mutations. Somatic
mutations that were called against multiple “normal” genomes were merged to increase detection
rates by overcoming regions of poor sequence coverage unique to either “normal” genome.
Variants occurring at an allele frequency greater than 0.5 or falling into repetitive regions or
those with consistent mapping errors were removed as described [24]. Somatic indels were
called by VarScan2 and filtered using the same methods described above. Separation of mutation
signatures present in our WGS data was performed by the Somatic Signatures R package
using nsNMF decomposition instead of Brunet NMF decomposition as described by Covington
and colleagues [36]. Mutation strand asymmetries were analyzed using somatic mutations
from all samples and the AsymTools MatLab software [37]. All raw sequences are available
from NCBI SRA under project number PRJNA312357.
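
The two-caller consensus and the merging across “normal” genomes described above can be illustrated with a short Python sketch. It is not the authors' code: the `Call` record, the function names, and the choice to take the allele frequency from the VarScan2 record are assumptions, and the repetitive-region and mapping-error filters are omitted.

```python
from typing import NamedTuple


class Call(NamedTuple):
    """One somatic call (hypothetical record layout)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    allele_freq: float


def site_key(call):
    """Identify a mutation by position and base change, ignoring allele frequency."""
    return (call.chrom, call.pos, call.ref, call.alt)


def consensus_calls(varscan_calls, mutect_calls, max_af=0.5):
    """Keep calls reported by both callers with allele frequency at most max_af."""
    mutect_keys = {site_key(c) for c in mutect_calls}
    return [
        c for c in varscan_calls
        if site_key(c) in mutect_keys and c.allele_freq <= max_af
    ]


def merge_across_normals(*call_lists):
    """Union of calls made against different "normal" genomes, without duplicates."""
    merged = {}
    for calls in call_lists:
        for c in calls:
            # Keep the first record seen for each site and base change.
            merged.setdefault(site_key(c), c)
    return sorted(merged.values(), key=lambda c: (c.chrom, c.pos))
```

Taking the intersection trades sensitivity for specificity, whereas the union across normals recovers sites that are poorly covered in only one of the reference genomes.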

Results 

System  for  Conditional  A3B  Expression 

Previous  studies  have  demonstrated  that  A3B  over-expression  induces  a  strong  DNA  damage 
response  resulting  in  cell  cycle  aberrations  and  eventual  cell  death  [13, 19, 21, 22,  32].  To  be  able  to 
control the degree of A3B-induced genotoxicity, we built upon our prior studies [13] by establishing
a single cell-derived isogenic system for conditional and titratable expression of this enzyme.
T-REx-293 cells were subcloned to establish an isogenic “mother” line, which was then transfected
stably with a doxycycline (Dox) inducible A3B-eGFP construct or with an eGFP vector as a negative
control. The resulting “daughter” clones were screened by flow cytometry to identify those
that were non-fluorescent without Dox (i.e., non-leaky) and uniformly fluorescent with Dox treatment
(Fig 1A). Daughter clones were also screened for Dox-inducible overexpression of A3B-eGFP
or eGFP by anti-GFP immunoblotting (Fig 1B). A3B-eGFP clones were uniformly GFP-negative
without Dox treatment, but eGFP only clones showed a low level of leaky expression, possibly
related to greater protein stability. As additional confirmation, the functionality of the
induced A3B-eGFP protein was tested in an in vitro ssDNA deamination assay with whole cell
extracts [33]. As expected, only extracts from Dox-treated A3B-eGFP cells elicited strong ssDNA
C-to-U editing activity as evidenced by the accumulation of the deaminated and hydrolytically
cleaved reaction products (labeled P in Fig 1C; see Methods for details). Nearly identical results
were obtained with a parallel set of independently derived daughter clones (Fig 1D and 1E).

Iterative  Rounds  of  A3B  Exposure 

To  establish  reproducible  A3B  induction  conditions,  a  series  of  cytotoxicity  experiments  was 
done  using  a  range  of  Dox  concentrations.  10,000  T-REx-293  A3B-eGFP  cells  were  plated  in 
10  cm  plates  in  triplicate,  treated  with  0, 1, 4,  or  16  ng/mL  Dox,  incubated  14  days  to  allow 


PLOS ONE | DOI:10.1371/journal.pone.0155391 May 10, 2016




[Fig 1 image. Panel labels: C-series Daughter Clones (A-C) and A-series Daughter Clones (D-F); samples A3B-eGFP and eGFP; Dox ng/mL; immunoblot rows GFP and β-actin with size labels 70 kDa, 30 kDa, and 42 kDa; deaminase assay bands 43 nt and 34 nt.]

Fig 1. A conditional system for A3B expression. (A) Flow cytometry data for T-REx-293 A3B-eGFP and
eGFP  daughter  cultures  24  hrs  after  Dox  treatment  (n  =  3;  mean  +/-  SD  of  technical  replicates).  (B)  Anti-GFP 
immunoblot  of  T-REx-293  A3B-eGFP  and  eGFP  daughter  cultures  24  hrs  after  Dox  treatment.  (C)  DNA 
cytosine  deaminase  activity  data  of  whole  cell  extracts  from  T-REx-293  A3B-eGFP  and  eGFP  daughter 
cultures  24  hrs  after  Dox  treatment.  (D,  E,  F)  Biological  replicate  data  using  A-series  daughter  clones  of  the 
experiments  described  in  panels  A,  B,  and  C,  which  used  C-series  daughter  clones. 

doi:10.1371/journal.pone.0155391.g001


time  for  colony  formation,  and  quantified  by  crystal  violet  staining.  As  expected,  higher  Dox 
concentrations  led  to  greater  levels  of  toxicity  (Fig  2A  and  2B).  Interpolation  from  a  best-fit 
logarithmic curve indicated that 2 ng/mL Dox (C-series daughter clone) or 1 ng/mL Dox (A-series
daughter clone) would cause 80-90% cytotoxicity, and these concentrations were selected
for subsequent experiments. Taken together with the measured doubling times of daughter
clones, each A3B-eGFP induction series was estimated to span 7 days (represented in the workflow
schematic in Fig 2C).
Fig 2. A3B induction optimization and targeted sequencing results. (A, B) Dose response curves indicating the
relative colony forming efficiency (viability index) of T-REx-293 A3B-eGFP daughter clones treated with the indicated
Dox concentrations (n = 3; mean viability +/- SD of biological replicates). The dotted lines show the Dox concentration
required to induce 80% cell death (2 or 1 ng/mL for C- and A-series daughter clones, respectively). (C) A schematic
representation of the experimental workflow depicting the viability index of a population of cells induced to express
A3B-eGFP and recover over time. Dox treatment occurs on day 1, maximal death is observed on days 3 or 4, and each
population typically rebounds to normal viability levels by days 6 or 7. (D-G) A summary of the base substitution
mutations observed in MYC (241 bp) and TP53 (176 bp) by 3D-PCR analysis of genomic DNA after 10 rounds of
A3B-eGFP or eGFP exposure. Red, blue, and black columns represent the absolute numbers of C-to-T, C-to-A, and other
base substitution types in sequenced 3D-PCR products, respectively. Asterisks indicate cytosine mutations occurring in
5’-TC dinucleotide motifs. The adjacent pie graphs summarize the base substitution mutation load for each 3D-PCR
amplicon. The number of sequences analyzed is indicated in the center of each pie graph.

doi:10.1371/journal.pone.0155391.g002

Each  T-REx-293  A3B-eGFP  daughter  clone  was  then  subjected  to  10  rounds  of  A3B-eGFP 
induction  and  recovery  (Fig  2C).  Iterative  exposures  to  A3B-eGFP  were  expected  to  generate 
dispersed  mutations  throughout  the  genome.  Ten  rounds  of  A3B-eGFP  induction  were  chosen 
as  a  sufficient  regimen  for  the  cells  to  accumulate  readily  detectable  levels  of  somatic  mutation 
as  a  proof-of-concept  for  this  inducible  system.  This  approach  also  left  open  the  option  to  go 
back  and  characterize  an  intermediate  round,  or  pursue  additional  rounds  should  analyses 
require fewer or more mutations, respectively.

A  potential  pitfall  of  this  experimental  approach  is  the  possibility  of  selecting  cells  that  have 
inactivated  the  A3B  expression  construct  or  the  capacity  for  induction  to  avoid  the  cytotoxic 
effects  of  overexpressing  this  DNA  deaminase.  Aliquots  of  cells  from  each  pulse  series  were 
therefore  periodically  tested  by  flow  cytometry  for  A3B-eGFP  inducibility,  western  blot  for 
protein  expression,  and  ssDNA  deamination  assays  for  enzymatic  activity  (e.g.,  Fig  1).  Even 
after the tenth induction series, the A3B-eGFP daughter clones performed similarly to the original
daughter cultures as well as to daughter cultures that had been grown continuously in parallel
to the Dox-exposed experimental cultures and diluted to mimic the population dynamics
caused by each A3B-eGFP exposure (e.g., Fig 1). These observations indicate that, despite negative
selection pressure imposed by A3B-eGFP mediated DNA damage, resistance or escape
mechanisms did not become overt.

Targeted  DNA  Sequencing  Provides  Evidence  for  A3B  Mutagenesis 

Next,  target  gene  3D-PCR  and  sequencing  were  used  to  determine  if  the  cells  within  each 
daughter  culture  had  accumulated  detectable  levels  of  mutation  after  10  rounds  of  A3B-eGFP 
exposure.  3D-PCR  is  a  technique  that  enables  the  preferential  recovery  of  DNA  templates  with 
C-to-T transitions and/or C-to-A transversions, because these mutations cause reduced hydrogen
bonding potential and yield DNA molecules that can be amplified at PCR denaturation
temperatures lower than those required to amplify the original non-mutated sequences [13, 38,
39]. MYC and TP53 were selected as target genes for this analysis because our prior work with
transiently over-expressed A3B, and work by others with related A3 family members, has demonstrated
that these genomic regions are susceptible to enzyme-catalyzed deamination [13, 34,
40-46].

The  3D-PCR  and  DNA  sequencing  analyses  revealed  substantially  more  mutations  in  MYC 
and  TP53  in  A3B-eGFP  exposed  daughter  cultures  in  comparison  to  controls  (Fig  2D-2G). 

For instance, in the C-series daughter clone 43 mutations, mostly C-to-T transitions, were evident
in MYC amplicons from A3B-eGFP exposed cultures, whereas only 9 mutations were
found in a similar number of control amplicons (mutation plot on left side of Fig 2D;
p = 0.00036, Student’s two-tailed t-test). The mutation load per amplicon was also higher (pie
graphs on right side of Fig 2D). Similar results were obtained for TP53 (Fig 2E; p = 0.11, Student's
two-tailed t-test), as well as for both MYC and TP53 in a parallel set of independently
derived A-series daughter clones (Fig 2F and 2G; p<0.0001 and p<0.0001, respectively, Student's
two-tailed t-test). The differences between A3B-eGFP exposed and control conditions
were statistically significant for three of four conditions and, taken together, these results provided
strong confirmation that 10 rounds of A3B-eGFP exposure caused increased levels of
genomic DNA mutagenesis.

Genome  Wide  Mutation  Analyses 

The experiments described above used pools of cells and, due to the largely stochastic nature of the
A3B  mutational  process  and  the  duration  of  the  pulse  series,  each  pool  would  be  expected  to 
manifest  extreme  genetic  heterogeneity.  This  complexity  would  constrain  a  standard  deep 
sequencing  approach  by  enabling  only  the  earliest  arising  mutations  to  be  detected  in  the  pool 
because  most  subsequent  mutations  would  persist  at  frequencies  too  low  for  reliable  detection. 
To reduce this complexity to a manageable level and be able to investigate the mutational history
of a single cell exposed to iterative rounds of either A3B-eGFP or eGFP, we used limiting
dilution to generate “granddaughter” subclones from the tenth generation daughter pools. The
strength of this strategy is that any new base substitution in a single daughter cell, which
occurred between the time the daughter clone was originally generated until the recovery
period following the tenth Dox treatment, would be fixed in the granddaughter clonal population
at a predictable allele frequency depending on local chromosome ploidy (i.e., new mutations
would be expected at 50% in diploid regions, 33% in triploid regions, 25% in tetraploid
regions, etc., of the 293 cell genome).
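
The expected frequencies quoted above follow from a new mutation arising on one of the n chromosome copies present at that locus, giving an expected allele fraction of 1/n. A minimal Python sketch, with a hypothetical function name and assuming the mutated copy is not later gained or lost:

```python
from fractions import Fraction


def expected_allele_fraction(copy_number):
    """Expected fraction of reads carrying a new mutation present on one copy.

    Assumes the mutation arose on exactly one of `copy_number` copies and that
    the local copy number did not change afterwards.
    """
    if copy_number < 1:
        raise ValueError("copy_number must be at least 1")
    return Fraction(1, copy_number)


for n in (2, 3, 4):
    # Prints the copy number and the expected percentage, e.g. 50% for diploid.
    print(n, f"{float(expected_allele_fraction(n)):.0%}")
```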

The  dynastic  relationship  between  mother,  daughter,  and  granddaughter  clones  in  this 
study is shown in Fig 3A. To provide initial estimates of the overall level of new base substitution
mutations, genomic DNA was extracted from each granddaughter clone and subjected to
single nucleotide polymorphism (SNP) analysis using the Illumina OmniExpress Bead Chip. A
base substitution mutation was defined as a clear SNP difference between each daughter clone
and its respective granddaughter clone. These analyses revealed a wide range of SNP alterations
among granddaughter clones, ranging from a low of <500 in the C-series eGFP expressing
granddaughter subclone CG1 to a high of over 8,000 in the A-series A3B-eGFP expressing granddaughter
subclone AA3 (Fig 3B). This extensive variability was expected based on the sublethal
Dox concentration used in each exposure round, the randomness of granddaughter clone selection,
and the stochastic nature of the mutation processes. Nevertheless, A3B-eGFP exposed
granddaughter clones had an average of 3.4-fold more new cytosine mutations than the eGFP
controls (averages shown by dashed vertical lines in Fig 3B). Sanger sequencing of cloned PCR
products was used to confirm several distinct SNP alterations and provided an orthogonal validation
of this array-based approach (e.g., representative chromatograms of mutations in granddaughter
CA1 versus corresponding non-mutated sequences from CG2 in Fig 3C). In addition,
hundreds more genomic copy number alterations were evident in A3B-eGFP exposed granddaughters
in comparison to eGFP controls (Fig 3D). Interestingly, the overall number of copy number
alterations appeared to correlate positively with the overall number of cytosine mutations,
suggesting that many A3B-catalyzed genomic DNA deamination events are likely processed into
DNA breaks and result in larger-scale copy number aberrations (Fig 3E).

A3B  Mutational  Landscape  by  Whole  Genome  Sequencing 

Next,  whole  genome  sequencing  (WGS)  was  done  to  assess  the  mutation  landscape  for  3  A3B- 
eGFP exposed and 3 eGFP control granddaughter clones from two distinct biological replicate
experiments (granddaughters depicted in Fig 3A). Samples were sequenced using the Illumina
X  Ten  platform  at  the  Beijing  Genome  Institute.  Approximately  700  million  150  bp  paired-end 
Fig 3. SNP analyses to estimate new mutation accumulation. (A) A dynastic tree illustrating the relationship between mother, daughter, and
granddaughter clones used for SNP and WGS experiments. The red, dashed box around the daughter clones denotes 10 cycles of Dox-treatment.
(B)  A  histogram  summarizing  the  SNP  alterations  observed  in  granddaughter  clones  by  microarray  hybridization.  Red,  blue,  and  black  colors 
represent  C-to-T,  C-to-A,  and  C-to-G  mutations,  respectively.  (C)  Sanger  sequencing  chromatograms  confirming  representative  cytosine  mutations 
predicted  by  SNP  analysis.  The  left  chromatogram  shows  a  G-to-A  transition  (C-to-T  on  the  opposite  strand)  and  the  right  chromatogram  a  C-to-G 
transversion.  (D)  A  histogram  plot  of  the  total  number  of  copy  number  (CN)  alterations  in  the  indicated  categories  in  A3B-eGFP  exposed 
granddaughter  clones  in  comparison  to  eGFP  exposed  controls,  which  were  normalized  to  zero  in  order  to  make  this  comparison.  (E)  A  dot  plot  and 
best-fit  line  of  data  in  panel  B  versus  data  in  panel  D. 


doi:10.1371/journal.pone.0155391.g003 


reads  were  generated  for  each  genome,  with  an  average  read  depth  of  34.5  ±  2.8  (SD)  per  locus. 
Reads were aligned against the hg19 genome with BWA and somatic mutations were called
using both VarScan2 (Washington University, MO) and MuTect (Broad Institute, MA), with
the intersection of the results of these two methods identifying unambiguous mutations for further
analysis [47, 48].
Using this conservative approach for mutation identification, a total of 6741, 3496, and
3530 somatic mutations occurred at cytosines in granddaughter clones that had been subjected
to 10 rounds of A3B-eGFP pulses in comparison to only 910 and 1531 cytosine mutations in
the eGFP controls, consistent with the results of the SNP analyses described above (p = 0.018,
Student’s t-test; Fig 4A; S1 Table). In particular, the A3B-eGFP pulsed granddaughter clones
had higher proportions of C-to-T mutations than the eGFP controls, 59%, 54%, and 52% versus
36% and 47%, respectively (red slices in pie graphs in Fig 4B). The A3B-eGFP pulsed
granddaughter  clones  also  had  higher  proportions  of  mutations  at  A/T  base  pairs  suggesting 
that  genomic  uracil  lesions  introduced  by  A3B  may  be  processed  by  downstream  error-prone 
repair  processes  analogous  to  those  involved  in  AID-dependent  somatic  hypermutation  of 
immunoglobulin  genes  [49]  (Fig  4A). 

However, despite finding significantly higher base substitution mutation loads in A3B-eGFP
pulsed granddaughter clones, the overall distributions of cytosine mutations within the
16 possible trinucleotide contexts appeared visually similar for the A3B-eGFP and eGFP controls
(histograms comparing the absolute frequencies of cytosine mutations with the 16 possible
trinucleotide contexts are shown in Fig 4C). This result was initially surprising because we
had expected obvious differences between the A3B-induced mutation spectrum and that attributable
to other mechanisms, particularly within 5’TC contexts. However, a closer inspection of
the eGFP control data sets strongly indicated that this 293-based system has a mutator phenotype,
possibly due to a defective replicative DNA polymerase proofreading domain and/or compromised
post-replication mismatch repair [50, 51]. For instance, the eGFP controls had large
numbers of base substitution mutations (predominantly C-to-A, C-to-T, and T-to-C) as well as
hallmark mutation asymmetries consistent with reported mutation spectra in mismatch repair
defective tumors with microsatellite instabilities (S1 Fig) [5, 37, 51]. Moreover, each eGFP control
had over 10,000 insertion/deletion mutations ranging in size from 1 to 46 base pairs (constrained
by the length of the Illumina sequencing reads).
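
For readers unfamiliar with the 16 trinucleotide contexts mentioned above, the following Python sketch shows one common way to tabulate them: mutations at G are reverse-complemented so that every cytosine mutation is reported on the C-containing strand within its 5’-NCN-3’ context. The function names and the `(trinucleotide, alt)` input layout are assumptions for illustration, not the analysis code used in this study.

```python
from collections import Counter
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# The 16 possible 5'-NCN-3' contexts around a mutated cytosine.
CYTOSINE_CONTEXTS = ["".join((five, "C", three)) for five, three in product("ACGT", repeat=2)]


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def cytosine_context(trinucleotide, alt):
    """Return (context, change) on the cytosine strand, or None for A/T sites.

    `trinucleotide` is the reference base flanked by its 5' and 3' neighbors;
    `alt` is the mutant base on the same strand.
    """
    tri, alt = trinucleotide.upper(), alt.upper()
    if tri[1] == "G":
        # A G-to-N change is a C-to-N' change on the opposite strand.
        tri, alt = reverse_complement(tri), reverse_complement(alt)
    if tri[1] != "C":
        return None
    return tri, f"C-to-{alt}"


def count_contexts(mutations):
    """Tally cytosine mutations by (context, change) from (trinucleotide, alt) pairs."""
    counts = Counter()
    for tri, alt in mutations:
        result = cytosine_context(tri, alt)
        if result is not None:
            counts[result] += 1
    return counts
```

Tallies of this kind are the input that signature extraction methods decompose, and they make 5’TC-specific enrichments directly comparable between samples.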

Therefore, to distinguish the A3B-eGFP induced mutation contribution from those caused
by intrinsic sources, we used nsNMF decomposition via the Somatic Signatures R package to
extract mutational signatures from granddaughter clones (Methods). This method extracted
three signatures that explain 99.6% of the total variance in the observed mutation spectra.
Extracted signature 1 (ES1) had large proportions of C-to-T mutations compared to the raw
profiles observed for each sample. ES1 also contained low proportions of C-to-A mutations.
The contribution of this signature to the overall mutation profile was specifically enriched in
the A3B-eGFP pulsed granddaughter clones, contributing about 75% of all mutations (Fig 4E).
Notably, this signature shows significant enrichments for C-to-T mutations within 5’TCG
motifs, which are biochemically preferred by recombinant A3B enzyme [13, 14, 20] (Fisher’s
exact test for ES1 using the average of the total observed mutations across A3B-eGFP pulsed
clones: TCA, p = 0.17; TCC, p = 1.00; TCG, p < 0.0001; TCT, p = 0.017). Moreover, strong
enrichments for C-to-G transversion mutations were evident for cytosine mutations within
TCW contexts (W = A or T) in ES1 in comparison to other trinucleotide combinations
(p = 0.0001, Student’s t-test). C-to-G transversions are hallmark A3B-mediated mutations
because other known cytosine-biased mutational processes such as aging (spontaneous deamination
of methyl-cytosines in 5’CG motifs) and UV-light (polymerase-mediated bypass of
cross-linked pyrimidine bases) primarily result in C-to-T transitions [2, 52]. Extracted signatures
2 (ES2) and 3 (ES3) were characterized by large proportions of C-to-A mutations occurring
independently of trinucleotide motif, in contrast to ES1. These WGS studies demonstrated
increased genome-wide mutagenesis attributable to A3B, even on top of significant pre-existing
mutation processes in this human 293 cell-based system.
Fig 4. Summary of somatic mutations detected by WGS. (A) Stacked bar graphs representing total number of C/G and T/A context somatic mutations
in  the  indicated  granddaughter  subclones  (black  and  white  bars,  respectively).  Sequences  from  granddaughter  clone  AG3  were  used  as  a  baseline  to  call 
mutations  in  AA3  (i.e.,  mutations  for  AG3  are  not  shown  in  bar  format  because  WGS  data  from  another  control  granddaughter  clone  were  not  available  for 
comparison).  (B)  Pie  charts  representing  the  proportion  of  each  type  of  cytosine  mutation  across  the  genome  in  the  indicated  granddaughter  clones.  Red, 
blue, and black wedges represent C-to-T, C-to-A, and C-to-G mutations, respectively. (C) Stacked bar graphs representing the observed percentage of
C-context somatic trinucleotide mutations detected in each granddaughter clone from the B panel. (D) Stacked bar graphs representing the extracted
mutation  signatures  from  WGS  data.  (E)  The  relative  proportion  that  each  extracted  mutation  signature  contributes  to  the  overall  base  substitution 
spectrum  in  the  indicated  granddaughter  clones. 

doi:10.1371/journal.pone.0155391.g004 


Discussion 

A3B  is  emerging  as  a  significant  source  of  somatic  mutation  in  many  different  cancer  types 
(reviewed  by  [1-4]  and  see  Introduction  for  references  to  primary  literature).  Here,  we  further 
develop a 293-based cellular system for conditional, Dox-mediated expression of A3B. The system
was validated using flow cytometry, immunoblotting, enzyme activity assays, and, most
importantly,  three  complementary  mutation  detection  methods  (3D-PCR,  SNP  array,  and 
WGS).  Our  results  demonstrated  higher  levels  of  cytosine-focused  mutations  in  A3B-eGFP 
expressing  cells,  in  comparison  to  eGFP  controls.  In  particular,  C-to-T  transition  mutations 
and  C-to-G  transversion  mutations  in  A3B  preferred  trinucleotide  motifs  predominated  after 
the  composite  mutation  spectra  were  extracted  into  3  separate  signatures.  These  studies  fortify 
the  conclusion  that  A3B  is  a  potent  human  genomic  DNA  mutagen. 

An  even  more  complex  picture  emerged  by  comparing  the  A3B-induced  mutation  signature 
with previously defined signatures [5]. ES1, which is attributable to A3B induction in this
293-based experimental system, clustered most closely to signature 1B, which is characterized
by a dominant proportion of C-to-T transitions at NCG motifs attributed to spontaneous
deamination of methyl-cytosine bases, rather than signatures 2 or 13, which are normally
attributed to APOBEC. A previous study overexpressed A3B in a different 293-based system,
and observed a similarly complex cytosine mutation distribution [22]. It is therefore possible
that the intrinsic preference of A3B for deaminating TCA and TCG motifs may be skewed in
living cells by downstream repair pathways or other mutation generating processes. In addition,
although the 293-based system used here showed evidence for some sort of repair deficiency
(below), ES2 and ES3 appeared most similar to signatures 5 and 16, which currently
have  no  known  etiology.  Thus,  the  WGS  data  from  this  293-based  system  indicated  that  the 
overall  “APOBEC”  signature  is  likely  to  be  more  complex  than  inferred  by  prior  studies. 

An unexpected outcome of our studies was the discovery of a significant preexisting mutation
process operating in this 293-based system. It is likely attributable to a defect in replicative
DNA  polymerase  proofreading  function  and/or  in  mismatch  repair  evident  by  microsatellite 
instability  and  pronounced  base  substitution  mutation  biases.  However,  the  molecular  nature 
of  this  defect  is  not  obvious  and  may  be  genetic  and/or  epigenetic.  For  instance,  the  WGS  data 
show  6  exonic  and  over  100  intronic  alterations  to  mismatch  repair  and  related  genes  that 
could  induce  such  a  mutator  phenotype.  These  results  are  consistent  with  a  prior  WGS  study 
that found thousands of mutation differences between 6 different 293-derived cell lines, as well as
significant down-regulation of MLH1 and MLH3 in a subset of lines [53]. Our studies are also
consistent  with  at  least  two  additional  prior  reports  characterizing  the  related  293T  cell  line  as 
mismatch  repair  defective  [54,  55].  Regardless  of  the  precise  molecular  explanation,  given  the 
large  number  of  labs  worldwide  that  rely  upon  293  or  293-derived  cell  lines,  knowledge  of  this 
mutator  phenotype  is  likely  to  be  helpful  for  informing  future  experimental  designs  using  this 
system. 

Despite a compelling case for A3B in cancer mutagenesis (key results cited in Introduction),
the overall APOBEC mutation signature in cancer cannot be explained by A3B alone,
because it is still evident in breast cancers lacking the entirety of the A3B gene due to a common
deletion polymorphism [56]. One or more of the other APOBEC family members with an
intrinsic preference for 5’TC dinucleotide substrates may be responsible. A leading candidate is
A3A due to high catalytic activity in biochemical assays, nuclear/cell-wide localization in some
cell types, propensity to induce a DNA damage response and cell death upon overexpression,
and the resemblance of its mutation signature in model systems to the observed APOBEC signature
in many cancers [13, 19, 21, 33, 39, 42, 57-63]. A3A gene expression may also be derepressed
as a side effect of the A3B gene deletion [64]. Additional studies will be needed to
unambiguously  delineate  the  identities  of  the  full  repertoire  of  cancer-relevant  APOBEC3 
enzymes,  quantify  their  relative  contributions  to  mutation  in  each  cancer  type,  and  build  upon 
this  fundamental  knowledge  to  improve  cancer  diagnostics  and  therapeutics. 

Supporting  Information 

S1 Fig. T-to-C mutations in all samples exhibit a DNA replication strand bias similar to
that  observed  in  MSI  cancers. 

(PDF) 
Breast tumors often display extreme genetic heterogeneity characterized by hundreds of gross chromosomal aberrations
and tens of thousands of somatic mutations. Tumor evolution is thought to be ongoing and driven by
multiple  mutagenic  processes.  A  major  outstanding  question  is  whether  primary  tumors  have  preexisting  mutations 
for  therapy  resistance  or  whether  additional  DNA  damage  and  mutagenesis  are  necessary.  Drug  resistance  is  a  key 
measure  of  tumor  evolvability.  If  a  resistance  mutation  preexists  at  the  time  of  primary  tumor  presentation,  then  the 
intended  therapy  is  likely  to  fail.  However,  if  resistance  does  not  preexist,  then  ongoing  mutational  processes  still 
have  the  potential  to  undermine  therapeutic  efficacy.  The  antiviral  enzyme  APOBEC3B  (apolipoprotein  B  mRNA- 
editing  enzyme,  catalytic  polypeptide-like  3B)  preferentially  deaminates  DNA  C-to-U,  which  results  in  signature 
C-to-T  and  C-to-G  mutations  commonly  observed  in  breast  tumors.  We  use  clinical  data  and  xenograft  experiments 
to  ask  whether  APOBEC3B  contributes  to  ongoing  breast  tumor  evolution  and  resistance  to  the  selective  estrogen 
receptor modulator, tamoxifen. First, APOBEC3B levels in primary estrogen receptor-positive (ER+) breast tumors inversely
correlate with the clinical benefit of tamoxifen in the treatment of metastatic ER+ disease. Second, APOBEC3B
depletion  in  an  ER+  breast  cancer  cell  line  results  in  prolonged  tamoxifen  responses  in  murine  xenograft  experiments. 
Third,  APOBEC3B  overexpression  accelerates  the  development  of  tamoxifen  resistance  in  murine  xenograft 
experiments  by  a  mechanism  that  requires  the  enzyme's  catalytic  activity.  These  studies  combine  to  indicate  that 
APOBEC3B  promotes  drug  resistance  in  breast  cancer  and  that  inhibiting  APOBEC3B-dependent  tumor  evolvability 
may  be  an  effective  strategy  to  improve  efficacies  of  targeted  cancer  therapies. 


INTRODUCTION 

Improvements  in  the  detection  and  therapy  of  operable  breast  tumors 
have  contributed  to  a  steady  decline  in  mortality  (1,  2).  Essentially  all 
breast  cancer  deaths  are  caused  by  metastatic  outgrowths  that 
compromise  vital  organs,  such  as  the  brain,  liver,  or  lungs.  Adjuvant 
systemic therapies effectively reduce the risk of recurrence at these
distant metastatic sites by treating preexisting, clinically undetectable,
micrometastatic  deposits.  In  estrogen  receptor-positive  (ER+)  breast 
cancer,  a  propensity  for  late  recurrence  more  than  5  years  after  surgery 
is  well  documented  and  has  resulted  in  recommendations  to  extend 
adjuvant endocrine therapy for a total of 10 years (3, 4). Although
endocrine therapy may be extended, it is evident that late recurrences
occur  even  while  the  patient  is  taking  appropriate  therapy  (5).  The  late 
recurrence  of  these  apparently  dormant  metastatic  breast  cancer  cells 
may  be  due  to  ongoing  tumor  evolution  and  acquisition  of  additional 
genetic  aberrations. 

Mutations  are  thought  to  be  the  major  drivers  of  recurrence, 
metastasis, and therapeutic resistance. Recent studies on the
molecular origins of mutations in breast cancer have implicated several
molecular  mechanisms,  including  both  spontaneous  and  enzyme- 
catalyzed  deamination  of  DNA  cytosine  bases  (6-10)  [reviewed  by 
Swanton  et  al.  (11),  Roberts  and  Gordenin  (12),  and  Helleday  et  al. 
(13)].  The  former  process  correlates  with  aging  and  is  mostly  due  to 
hydrolytic  conversion  of  5-methyl  cytosine  (mC)  bases  within  5' 
NmCG  motifs  into  thymines,  which  escape  base  excision  repair  and 
are  converted  into  C-to-T  transition  mutations  by  DNA  replication 
(N  =  A,  C,  G,  or  T).  The  latter  process  is  attributable  to  single-stranded 
DNA  cytosine-to-uracil  (C-to-U)  deamination  catalyzed  by  one  or 
more  members  of  the  APOBEC3  (apolipoprotein  B  mRNA-editing 
enzyme,  catalytic  polypeptide-like  3)  family  of  enzymes,  characterized 
by  C-to-T  transitions  and  C-to-G  transversions  in  5'TCW  motifs 
(W  =  A  or  T). 
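
As an illustration of the motif definition above, the following minimal Python sketch lists the cytosines on one strand that fall within 5'TCW contexts (W = A or T). The function name and the example sequence are hypothetical and are not part of this study; a fuller analysis would also scan the reverse-complement strand.

```python
def tcw_cytosine_positions(seq):
    """Return 0-based indices of the C in each 5'-TCW-3' motif (W = A or T).

    Only the strand given in `seq` is scanned (illustrative assumption).
    """
    seq = seq.upper()
    positions = []
    for i in range(len(seq) - 2):
        # A motif is T, then C, then A or T; record the index of the C.
        if seq[i] == "T" and seq[i + 1] == "C" and seq[i + 2] in "AT":
            positions.append(i + 1)
    return positions


# Hypothetical example sequence: TCA and TCT qualify, TCG does not.
example = "ATCAGTCTTCG"
print(tcw_cytosine_positions(example))
```

Each reported position marks a cytosine that, if deaminated to uracil, could give rise to the C-to-T or C-to-G substitutions described in the text.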

Human  cells  have  the  capacity  to  express  up  to  seven  distinct 
APOBEC3  enzymes,  which  function  normally  as  overlapping  innate 
immune  defenses  against  a  wide  variety  of  DNA-based  viruses  and 
transposons  [reviewed  by  Malim  and  Bieniasz  (14),  Stavrou  and  Ross 
(15),  and  Simon  et  al.  (16)].  APOBEC3A  (A3A)  and  APOBEC3B 
(A3B) are leading candidates for explaining APOBEC signature
mutations in breast tumors because overexpression of these enzymes
triggers DNA damage responses and inflicts chromosomal mutations
in hallmark trinucleotide contexts (7, 17-21). However, endogenous
A3A is not expressed significantly, nor is its activity detectable
in  breast  cancer  cell  lines  (7, 22)  (see  Results).  The  molecular  relevance 
of A3A is therefore difficult to assess because the impact of the
endogenous protein cannot be quantified. In comparison, endogenous A3B
is  predominantly  nuclear  and  has  been  shown  to  be  responsible  for 
elevated  levels  of  genomic  uracil  and  mutation  in  multiple  breast 
cancer  cell  lines  (7,  22).  A3B  is  overexpressed  in  approximately  50% 
of primary breast tumors (7, 8), and retrospective studies have
associated elevated A3B mRNA levels with poor outcomes for adjuvant
treatment-naive  ER+  breast  cancer  cohorts  (23, 24).  Our  original  studies 
relied  on  a  retrospective  prognostic  analysis  of  a  treatment-naive  ER+ 
breast  cancer  cohort  (23);  therefore,  the  observed  correlation  between 
elevated  A3B  mRNA  levels  and  poor  clinical  outcomes  is  consistent 
with  a  variety  of  therapy-independent  intrinsic  molecular  mechanisms 
ranging  from  indirect  models  (such  as  A3B  promoting  tumor  cell 
growth)  to  direct  models  (such  as  A3B  causing  the  genomic  DNA 
damage  that  results  in  mutations  that  fuel  ongoing  tumor  evolution). 

A  current  debate  in  the  cancer  field  is  whether  the  mutations 
that  cause  therapy  resistance  preexist  in  primary  tumors  (that  is, 
exist  even  before  diagnosis)  or  continually  accumulate  (even  after 
treatment  initiation).  In  support  of  the  former  view,  primary  tumors 
are  often  composed  of  billions  of  cells  that  are  highly  heterogeneous, 
and deep-sequencing studies have found known drug resistance
mutations before therapy initiation [for example, (25-27)]. However,
many studies also support the latter view of ongoing tumor evolution.
For instance, primary tumor deep-sequencing studies often fail
to find evidence for preexisting resistance mutations [for example,
(26, 28)]. Recurrent breast tumors also often have many more
somatic mutations compared to corresponding primary tumors,
suggesting ongoing and cumulative mutational processes (29, 30). In
addition, the subclonal nature of most mutations in breast cancer,
as well as many other cancer types, provides strong evidence for
ongoing tumor evolution, including significant proportions of APOBEC
signature  mutations  (28,  31,  32).  Moreover,  at  the  clinical  level,  the 
fact  that  remission  periods  in  breast  cancer  can  last  for  many  years 
strongly  suggests  that  additional  genetic  changes  are  required  for  at 
least  one  remaining  tumor  cell  to  manifest  as  recurrent  disease  (3,  4). 
Here, we test the hypothesis that A3B contributes to ongoing tumor
evolution and to the development of drug resistance mutations in ER+
breast cancer.


RESULTS 

Primary  breast  tumor  A3B  mRNA  levels  predict  therapeutic 
failure  upon  tumor  recurrence 

To  determine  whether  A3B  contributes  to  endocrine  therapy 
resistance,  we  evaluated  the  predictive  potential  of  A3B  expression 
in  primary  breast  tumors  from  a  total  of  285  hormone  therapy- 
naive  breast  cancer  patients  who  received  tamoxifen  as  a  first-line 
therapy  for  recurrent  disease  (33).  A  schematic  of  the  study  timeline 
is  shown  in  Fig.  1A,  and  detailed  patient  characteristics  are  shown  in 
table S1. Archived fresh-frozen primary tumor specimens were used
to prepare total RNA, and reverse transcription quantitative
polymerase chain reaction (RT-qPCR) was used to quantify A3B mRNA
levels.  These  gene  expression  results  were  divided  into  four  quartiles 
for  subsequent  clinical  data  analysis,  with  primary  tumors  of  the 
upper  quartile  expressing  an  average  of  fourfold  to  sixfold  more  A3B 
mRNA than those in the lower quartile (dark blue versus red
histogram bars, respectively, in Fig. 1B).
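
The grouping step described above can be sketched in a few lines of Python. This is a minimal illustration that assumes a simple rank-based split into four near-equal groups; the cut points actually used in the study are not stated here, and the function name and expression values are hypothetical.

```python
def assign_quartile_groups(values):
    """Assign each value to group 1 (lowest) through 4 (highest) by rank.

    Assumption: groups are formed by rank so that sizes are as equal as
    possible; ties are broken by input order.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    groups = [0] * n
    for rank, idx in enumerate(order):
        groups[idx] = rank * 4 // n + 1
    return groups


# Invented relative expression values (not study data).
levels = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 1.1, 0.05]
print(assign_quartile_groups(levels))
```

Patients in group 4 would correspond to the highest-expressing quartile and those in group 1 to the lowest, which are the two groups compared in the survival analysis.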

The progression-free survival (PFS) durations following
recurrence and subsequent first-line tamoxifen therapy were compared
for  each  of  the  four  A3B  expression  groups.  This  analysis  revealed  a 
dose-response  relationship,  with  the  highest  A3B-expressing  group 
associating  with  the  shortest  PFS  and  with  the  lowest  A3B-expressing 
group  associating  with  the  longest  PFS  (Fig.  1C;  log-rank,  P  < 
0.0001).  The  median  PFS  was  6.2  months  for  the  highest  A3B- 
expressing  group  and  14.5  months  for  the  lowest  A3B-expressing 
group  [hazard  ratio  (HR)  2.40  (1.69  to  3.41);  log-rank,  P  <  0.0001]. 
This  result  remained  significant  for  high  versus  low  A3B  levels  in 


[Figure 1, panels A to C. Panel A labels: ER+ primary tumor; surgery and RT-qPCR analysis; no endocrine adjuvant therapy; recurrence and tamoxifen therapy; analysis of clinical outcome (PFS).]

Fig. 1. High A3B levels in primary ER+ breast tumors predict poor response to tamoxifen therapy after tumor recurrence. (A) Schematic of the clinical time course.
Timeline  breaks  depict  variable  intervals  between  clinical  milestones.  (B)  Relative  A3B  expression  levels  in  each  observation  group  [mean  ±  SD  of  n  =  72  (quartiles  1  and 
3),  n  =  70  (quartile  2),  and  n  =  71  (quartile  4)].  (C)  Kaplan-Meier  curves  showing  the  periods  of  PFS  after  initiating  tamoxifen  therapy  for  patients  whose  primary  tumors 
expressed  A3B  at  low  (dark  blue  line),  intermediate  (light  blue  and  orange  lines),  or  high  levels  [red  line;  patient  groups  and  color  scheme  match  those  in  (B)]. 
a multivariate analysis after including the known clinical pathological
predictors of age, disease-free interval, dominant site of relapse,
adjuvant chemotherapy, and ER and progesterone receptor mRNA levels
measured  in  the  primary  tumor  [HR  2.19  (1.51  to  3.20);  log-rank,  P  < 
0.0001;  table  S2].  These  data  indicate  that  primary  tumor  A3B  mRNA 
levels  are  strong  and  independent  predictors  of  PFS  for  recurrent  ER+ 
breast cancer treated with tamoxifen. These observations do not
support models in which resistance-conferring mutations preexist in
primary  tumors — or  disease  outcomes  would  have  had  no  correlation 
with  A3B  expression  levels  and  the  data  for  each  quartile  group  would 
have  superimposed.  Rather,  the  data  support  a  model  in  which  A3B 
promotes  the  ongoing  diversification  of  residual  primary  tumor  cells 
(micrometastatic  deposits)  that  ultimately  manifest  in  the  recurrent 
setting  as  acquired  resistance,  failed  tamoxifen  therapy,  and  disease 
progression. 
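
For readers less familiar with the PFS comparison, the sketch below shows one standard way to compute a Kaplan-Meier curve and its median from follow-up times and event indicators. The data are invented for illustration, the function names are hypothetical, and this is not the statistical software used for Fig. 1C.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct time with at least one event.

    `events` holds 1 for progression and 0 for censoring.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        progressed = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            progressed += data[i][1]
            leaving += 1
            i += 1
        if progressed:
            survival *= 1 - progressed / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_time(curve):
    """First time at which estimated survival is 0.5 or lower, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Invented follow-up data in months (1 = progression, 0 = censored).
months = [2, 3, 3, 5, 8]
status = [1, 0, 1, 1, 0]
print(median_time(kaplan_meier(months, status)))
```

Comparing the medians of such curves across expression groups mirrors, in simplified form, the comparison of the highest and lowest A3B quartiles reported above.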

Endogenous  A3B  depletion  does  not  alter  the  phenotype  of 
MCF-7L  ER+  breast  cancer  cells  in  culture 

MCF-7  has  been  used  for  decades  as  a  unique  cell-based  model  for 
ER+  breast  cancer  research  [reviewed  by  Lee  et  al.  (34)].  Engrafted 
MCF-7  tumors  are  dependent  on  ER  function  and  therefore  are 
sensitive to selective ER modulators, including tamoxifen. Furthermore,
tamoxifen-induced tumor dormancy (indolence) in this model
system, which can last for several months, frequently leads to drug-resistant
and highly proliferative cell masses. For further studies, including
animal experiments below, we elected to use the derivative
line  MCF-7L  because  it  is  tumorigenic  in  immunodeficient  mice 
[Ibrahim  et  al.  (35),  Sachdev  et  al.  (36),  and  references  therein] 
and  expresses  endogenous  A3B  mRNA  at  levels  approximating  those 
found  in  many  primary  breast  tumors  (7).  Like  most  other  breast 
cancer cell lines, MCF-7L cells have very low levels of A3A and
variable levels of other APOBEC3 mRNAs, which have not been
implicated in breast cancer mutagenesis (fig. S1).

We  initially  asked  whether  endogenous  A3B  depletion  alters 
molecular or cellular characteristics of MCF-7L. Cells were
transduced with an A3B-specific short hairpin RNA (shRNA) construct
(shA3B)  or  a  nonspecific  shRNA  construct  as  a  control  (shCON)  (7), 
and  uniform  shRNA-expressing  pools  were  selected  using  the  linked 
puromycin  resistance  gene.  In  all  shA3B-transduced  pools,  a  robust 
>25-fold  depletion  of  endogenous  A3B  mRNA  was  achieved  (Fig.  2A). 
Moreover, the depletion of A3B mRNA was mirrored by a
corresponding ablation of all measurable DNA cytosine deaminase activities
from  whole-cell  and  nuclear  extracts  (Fig.  2B).  Although  several  other 
APOBEC family member genes are expressed in MCF-7L, their
protein levels are likely too low to detect using this assay (A3A, A3D,
A3G, and A1), the enzyme is not active on DNA (A2), and/or their
single-stranded  DNA  cytosine  deaminase  activity  is  not  evident  in 
cellular  extracts  (A3C  and  A3F)  (7,  22).  At  the  microscopic  level, 
shA3B-  and  shCON-expressing  cells  were  visibly  indistinguishable 
(Fig.  2C).  The  two  cell  populations  showed  nearly  identical  growth 
rates  and  doubling  times  in  cell  culture  (Fig.  2,  D  and  E).  These 
results  are  consistent  with  A3B  knockdown  data  using  the  same 
shRNA  construct  in  other  breast  cancer  cell  lines  (7,  22)  and  with 
the  observation  that  A3B  is  a  nonessential  human  gene  (37). 
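
Doubling times such as those compared in Fig. 2, D and E, can be related to cell counts through the standard exponential growth relation. The sketch below assumes exponential growth between two counts; the function name is hypothetical and the numbers are invented for illustration rather than taken from the study.

```python
import math


def doubling_time(initial_count, final_count, elapsed_hours):
    """Doubling time in hours, assuming exponential growth between two counts."""
    if final_count <= initial_count:
        raise ValueError("final_count must exceed initial_count")
    return elapsed_hours * math.log(2) / math.log(final_count / initial_count)


# Invented example: a fourfold increase over 48 hours is two doublings.
print(doubling_time(1e5, 4e5, 48))
```

Two cultures with nearly identical doubling times under this relation would also show nearly superimposed growth curves, which is the pattern reported for shA3B and shCON cells.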

A3B  is  required  for  the  development  of  tamoxifen-resistant 
tumors  in  mice 

The  clinical  data  reported  in  Fig.  1  support  a  model  in  which  A3B 
is responsible for precipitating the mutations that promote tamoxifen
resistance. To directly test this model, we performed a series of
xenograft experiments using MCF-7L pools in which endogenous
A3B was left intact (shCON) or was depleted with the specific
shRNA described above (shA3B). For each condition, 5 million
cells were injected subcutaneously into the flank regions of a cohort
of 5-week-old immunodeficient mice, and tumors were allowed to
reach a volume of approximately 150 mm³. At this point, typically
40 to 50 days after engraftment, the mice in each experimental
group were randomly assigned into two subcohorts, one to receive
daily tamoxifen injections and the other to be observed in parallel
as a control (schematic of experimental design in Fig. 3A).

Fig. 2. Endogenous A3B depletion does not alter MCF-7L ER+ breast cancer
cells in culture. (A) A3B mRNA levels in MCF-7L cells expressing shA3B or shCON
constructs (TBP, TATA-binding protein mRNA; each bar represents the mean ± SD
of three RT-qPCR assays). (B) A3B DNA cytosine deaminase activity in soluble
whole-cell (W), cytoplasmic (C), and nuclear (N) extracts of MCF-7L cells
expressing shA3B or shCON constructs. Vector (V) and A3B-transfected 293T cell
lysates were used as controls (S, substrate; P, product). (C) Light microscopy
images of shA3B and shCON expressing MCF-7L pools. (D and E) Growth kinetics
and doubling times of cultured MCF-7L cells expressing shA3B versus shCON
constructs (mean ± SD of n = 6 cultures per condition).

Fig. 3. A3B is required for the development of tamoxifen-resistant tumors in
mice. (A) Schematic of the A3B knockdown xenograft study design and time
course (see text for details). (B) Growth kinetics of engrafted MCF-7L cells
expressing shA3B or shCON in the absence or presence of tamoxifen (TAM) treatment.
Tumor volumes were measured weekly (mean + SEM shown for clarity of data
presentation). (C) A3B mRNA levels in xenografted tumors recovered from the experiment
shown in (B) (TBP mRNA; each bar represents the mean ± SD of three RT-qPCR assays).
(D) MTT [3-(4,5-dimethylthiazol-2-yl)-2,5-diphenyltetrazolium bromide] data
comparing tamoxifen susceptibility of input MCF-7L cells versus tamoxifen resistance of a
representative MCF-7L shCON tumor [tamoxifen (10, 100, and 1000 ng/ml)].


Control-transduced MCF-7L cells formed large 1000-mm³ tumors
within 100 days after engraftment and, interestingly, A3B
knockdown caused a modest delay in tumor growth (open blue
versus open orange symbols in Fig. 3B; linear mixed model, F test,
P = 0.002). This result differed from the near-identical growth rates
in  cell  culture  (Fig.  2,  D  and  E)  and  may  be  due  to  the  likelihood 
that  additional  adaptations/mutations  are  required  for  monolayer/ 
plastic-conditioned  cells  to  be  able  to  grow  optimally  as  tumors  in 
mice.  As  expected,  tamoxifen  treatment  attenuated  the  growth  of 
both  engineered  pools  (filled  orange  and  blue  symbols  in  Fig.  3B). 
However, control-transduced cells rapidly developed resistance to
tamoxifen and grew into large tumors, whereas the growth of the
A3B-depleted cell masses was mostly suppressed by tamoxifen over the
year-long  duration  of  this  representative  experiment  (filled  orange 
versus  blue  symbols  in  Fig.  3B;  linear  mixed  model,  F  test,  P  < 
0.0001).  Similar  outcomes  were  observed  in  additional  experiments 
(for  example,  fig.  S2). 

Xenograft  tumor  A3B  mRNA  levels  were  analyzed  by  RT-qPCR, 
and,  in  all  instances,  the  intended  knockdown  or  control  mRNA 
level was found to be durable and maintained through the entire
duration of the experiment (Fig. 3C). This series of control experiments
also  revealed  that  endogenous  A3B  mRNA  levels  increase  in  control 
shRNA-transduced  tumor  masses  in  comparison  to  the  same  cells 
before  engraftment  (Fig.  3C).  The  mechanism  for  A3B  induction 
in  immunodeficient  mice  is  not  known  but  is  unlikely  to  be  due  to 
estrogen (figs. S3 and S4), as suggested by a recent report (38).
Representative xenografts were recovered in culture, and the
tamoxifen-resistant phenotype was reconfirmed (for example, Fig. 3D). These
results  are  fully  supportive  of  a  mechanism  in  which  endogenous 
A3B causes an inheritable drug resistance phenotype (addressed
further below). It is notable that endogenous A3B mRNA levels in this
system  are  comparable  to  those  observed  in  a  large  proportion  of 
primary  tumors  [approximately  0.1  to  0.2  relative  to  TBP  mRNA 
levels in cultured MCF-7L cells (Fig. 2B), 0.4 relative to TBP in
animal tumors described here (Fig. 3C and fig. S3), and a range of 0 to
1.25  and  a  median  of  0.25  relative  to  TBP  in  primary  breast  tumors 
previously  documented  using  the  same  RT-qPCR  assay  (7)]. 

A  novel  lentivirus-based  system  enables  A3B  overexpression 
in  any  cell  type 

We next developed a conditional A3B overexpression system to further
test the A3B mutagenesis model. A conditional approach is required
because A3B expression in virus-producing cells causes lethal
mutagenesis  of  retroviral  complementary  DNA  intermediates  during 
reverse  transcription  (39-42),  and  excessive  levels  of  cellular  A3B 
have  the  potential  to  inflict  genomic  DNA  damage  that  ultimately 
leads to cytotoxicity (7, 18, 19). We therefore developed a novel lentiviral
construct that will only express A3B upon transduction into susceptible
target cells (Fig. 4A). This construct mitigates viral toxicity
issues because it is inactive in virus-producing cells as a result of disruption
of the antisense A3B open reading frame with a sense strand
intron,  and  it  is  only  expressed  after  intron  removal  by  splicing  in  the 
virus-producing  cells  and  reverse  transcription  and  integration  of  the 
full  proviral  DNA  in  susceptible  target  cells.  It  also  mitigates  toxicity 
issues  for  target  cell  populations  because  expression  levels  are  not 
excessive  (see  below).  In  parallel,  an  A3B  catalytic  mutant  derivative 
(E255Q) was created by site-directed mutagenesis to serve as a negative
control.

Transducing  viruses  were  made  by  plasmid  transfection  into  293T 
cells  with  appropriate  retroviral  helper  plasmids  encoding  Gag,  Pol, 
and  Env  (vesicular  stomatitis  virus  glycoprotein).  As  anticipated,  no 
producer  cell  toxicity  was  observed,  and  A3B  and  A3B-E255Q  viral 
titers  were  equivalent  by  RT-qPCR.  MCF-7L  cells  were  transduced 
Fig. 4. Novel lentivirus-based system for conditional A3B overexpression. (A) Schematic of the lentiviral construct for conditional A3B overexpression (see text for
details). LTR, long terminal repeat; RSV, Rous sarcoma virus; CTD, C-terminal domain; NTD, N-terminal domain; CMV, cytomegalovirus; SV40, simian virus 40. (B) A3B
mRNA levels relative to TBP in MCF-7L cells expressing lentivirus-delivered A3B or a catalytic mutant derivative (E255Q) as well as endogenous A3B (mean ± SD of three
RT-qPCR assays). (C) Doubling times of cultured MCF-7L cells overexpressing A3B or A3B-E255Q (mean ± SD of four replicates).


with  each  virus  stock,  and  puromycin  selection  was  used  to  eliminate 
nontransduced  cells  and  to  ensure  100%  transduction  efficiencies. 
A3B  quantification  by  RT-qPCR  showed  that  each  construct  elevates 
mRNA  expression  to  levels  approximately  10-fold  higher  than  those 
of the reference gene TBP (Fig. 4B), which equate to levels approximately
50-fold higher than those of the endogenous A3B expressed
in  this  system.  These  A3B  mRNA  levels  are  similar  to  those  found 
in  the  top  fraction  of  breast  tumors  and  cancer  cell  lines  [Burns 
et  al.  (7),  Leonard  et  al.  (22),  Sieuwerts  et  al.  (23),  and  this  study]. 
As  for  the  A3B  knockdown  experiments  above,  A3B-  and  A3B- 
E255Q-overexpressing  MCF-7L  populations  showed  no  overt  signs 
of  toxicity  and  indistinguishable  growth  rates  (Fig.  4C). 
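
The two fold-change statements in this paragraph can be reconciled with simple arithmetic, taking the endogenous A3B level of approximately 0.1 to 0.2 relative to TBP reported earlier in the text for cultured MCF-7L cells. The sketch below is only a consistency check on those rounded figures; the function name is hypothetical.

```python
def fold_over_endogenous(overexpressed_rel_tbp, endogenous_rel_tbp):
    """Ratio of two expression levels that are both expressed relative to TBP."""
    return overexpressed_rel_tbp / endogenous_rel_tbp


# Rounded values from the text: about 10 relative to TBP after transduction,
# and about 0.1 to 0.2 relative to TBP for endogenous A3B in culture.
upper_endogenous = fold_over_endogenous(10, 0.2)
lower_endogenous = fold_over_endogenous(10, 0.1)
print(round(upper_endogenous), round(lower_endogenous))
```

With the upper end of the endogenous range the ratio is about 50, in line with the approximately 50-fold figure quoted above; the lower end of the range would give a larger ratio.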

Overexpression  of  catalytically  active  A3B  accelerates  the 
development  of  tamoxifen-resistant  tumors 

To  further  test  the  model  in  which  A3B  provides  mutagenic  fuel  for 
tumor  evolution  and  drug  resistance,  we  performed  a  series  of 
xenograft  experiments  using  MCF-7L  cells  transduced  with  the 
aforementioned  constructs  and  thereby  overexpressing  wild-type 
A3B or the catalytic mutant derivative A3B-E255Q (Fig. 5A).
Immunodeficient animals were injected subcutaneously with 5 million
cells and, upon palpable tumor growth (150 mm³), randomly
divided  into  groups  for  tamoxifen  injections  or  control  observation. 
Remarkably,  most  of  the  cell  masses  overexpressing  A3B  developed 
rapid  resistance  to  tamoxifen  (filled  red  symbols  in  Fig.  5B).  In 
comparison,  MCF-7L  cells  expressing  equivalent  levels  of  A3B- 
E255Q  mutant  mRNA  showed  resistance  kinetics  similar  to  those 
of  the  shCON  engraftments  described  above  (filled  orange  symbols 
in  Fig.  5B;  linear  mixed  model,  F  test,  P  =  0.015).  An  independent 
experiment  yielded  similar  results  (fig.  S5).  These  data  demonstrate 
that  A3B  overexpression  accelerates  the  kinetics  of  the  development 
of  tamoxifen  resistance  and,  notably,  that  this  phenotype  requires 
catalytic  activity. 
Fig. 5. Overexpression of catalytically active A3B accelerates the development
of tamoxifen-resistant tumors in mice. (A) Schematic of the A3B overexpression
xenograft study design and time course (see text for details). (B) Growth
kinetics of engrafted MCF-7L cells overexpressing A3B or A3B-E255Q in the absence
or presence of tamoxifen treatment. The graph reports tumor volumes
measured  weekly  (mean  +  SEM  shown  for  clarity  of  data  presentation).  Average 
tumor  volumes  from  the  untreated  control  arms  are  shown  by  gray  symbols,  and 
overlapping  error  bars  are  omitted  for  clarity  of  presentation. 

ESR1 mutations are not responsible for tamoxifen resistance
in  the  MCF-7L  model  for  ER+  breast  cancer 

Although  the  development  of  tamoxifen-resistant  breast  tumors  is 
a  major  clinical  problem,  in  most  cases  the  molecular  basis  for 
resistance  is  unknown.  A  small  fraction  of  treated  patients  develop 
tumors  with  ESR1  exonic  mutations  that  cause  amino  acid  changes 
in  the  hormone-binding  domain  of  the  ER.  These  mutations  have 
been  seen  mostly  in  tumors  resistant  to  aromatase  inhibitors  and 
not  as  frequently  in  tumors  resistant  to  tamoxifen  [reviewed  by 
Clarke  et  al.  (43)  and  Jeselsohn  et  al.  (44)].  To  determine  whether 
ESR1 mutations are also part of the tamoxifen resistance mechanism
in MCF-7L cells, we performed DNA exome sequencing on
9 independent tamoxifen-resistant xenografts and 10 independent
control tumor masses. The ESR1 gene contained no mutations under
either condition (see table S3 for a full list of base substitution
mutations).  Resistant  tumor  ESR1  mRNA  levels  were  somewhat 
variable  but  still  similar  to  those  present  in  the  original  MCF-7L 
cell  populations  (fig.  S6).  Together  with  the  data  presented  above 
indicating  heritable  resistance  to  tamoxifen  (Fig.  3D),  these  results 
suggest  that  at  least  one  other  resistance  mechanism  occurs  in  the 
MCF-7  model  system  for  ER+  breast  cancer. 


DISCUSSION 

The  clinical  and  xenograft  results  presented  here  strongly  support  a 
model  in  which  A3B  drives  tamoxifen  resistance  in  ER+  breast 
cancer.  Clinically,  resistance  to  endocrine  therapies  has  been  defined 
as  primary  or  secondary,  depending  on  the  length  of  time  a  patient 
benefits  from  ER-targeted  therapy.  Our  data  suggest  that  A3B  may 
have a role in both kinds of resistance and particularly in the development
of secondary, acquired resistance. Suppression of endogenous
levels of A3B enhances tamoxifen benefit (Fig. 3), whereas overexpression
of A3B eliminates almost all benefits from tamoxifen therapy
(Fig. 5). Because the only known biochemical activity of A3B is
single-stranded  DNA  cytosine  deamination  [for  example,  (7,  42,  45)] 
and  the  tamoxifen  resistance  phenotype  is  heritable  (Fig.  3D),  the 
most  likely  mechanism  is  A3B-catalyzed  DNA  C-to-U  editing  coupled 
to  the  processing  of  these  uracil  lesions  into  somatic  mutations  by 
normal  DNA  repair  processes  [reviewed  by  Swanton  et  al.  (11),  Roberts 
and  Gordenin  (12),  and  Helleday  et  al.  (13)].  In  further  support  of  this 
mechanism,  the  catalytic  glutamate  of  A3B  (E255)  is  required  for 
accelerated  tamoxifen  resistance  kinetics  upon  enzyme  overexpression. 

Because ESR1 mutations were not observed in MCF-7L tamoxifen-resistant
tumors, the identity of the resistance-conferring mutations
in this system will require significant future studies and possibly even
whole-genome sequencing if the predominant causal lesions lie outside
the exomic fraction of the genome. The intrinsic signature of
A3B  may  help  to  identify  candidate  (frequently  mutated)  sites  for 
mechanistic follow-up. Then, for instance, genetic knock-in experiments
could be used to unambiguously establish a cause-effect relationship.
However, the resistance-conferring mutations (such as
gene  translocations,  amplifications,  or  deletions)  could  also  be  complex 
and  difficult  to  recapitulate  precisely  because  DNA  repair  enzymes 
can  readily  process  genomic  uracil  lesions  into  single-  and  double- 
stranded  breaks  (46,  47). 

A3B  has  been  implicated  as  a  dominant  source  of  mutation  in  breast, 
head/neck,  lung,  bladder,  and  cervical  cancers  and — to  a  lesser  but  still 
significant  extent — in  many  other  tumor  types  (7-10,  28,  32,  48,  49). 
The fundamental nature of the DNA deamination mechanism, together
with the data presented here, strongly suggests that A3B may be a general
mechanism of therapeutic resistance to cancer therapy. At this
point, potential mutagenic contributions from other APOBEC3 family
members, such as A3A, cannot be excluded fully, but they do not appear
to manifest in the MCF-7L system, nor are these potential contributions
large enough to prevent the significant association between
A3B  expression  levels  and  clinical  outcomes  for  ER+  breast  cancer 
patients  [treatment-naive  data  in  the  studies  by  Sieuwerts  et  al.  (23) 
and  Cescon  et  al.  (24)  and  post-recurrence  tamoxifen  resistance  data 
in  Fig.  1].  Thus,  strategies  to  down-regulate  A3B  activity  or  expression, 
as  reported  here  using  a  specific  shRNA  knockdown  construct  in  a 
model system for ER+ breast cancer, may be beneficial as chemotherapeutic
adjuvants to “turn down” the mutation rate, decrease the
likelihood  of  evolving  drug  resistance,  and  prolong  the  clinical  benefit 
of  therapy  for  the  many  cancers  that  are  likely  to  be  driven  by  this 
ongoing  mutational  process. 

MATERIALS  AND  METHODS 
Clinical  studies 

The  clinical  characteristics  of  the  285  patients  [225  from  Rotterdam 
(Erasmus  University  Medical  Center)  and  60  from  Nijmegen  (Radboud 
University  Medical  Center)]  whose  primary  tumor  specimens  and  data 
were  used  here  have  been  described  previously  by  Sieuwerts  et  al.  (33). 
The protocol to study biological markers associated with disease outcome
was approved by the medical ethics committee of the Erasmus
University  Medical  Center  (Rotterdam,  Netherlands)  (MEC  02.953); 
for  Nijmegen,  coded  primary  tumor  tissues  were  used  in  accordance 
with the Codes of Conduct of the Federation of Medical Scientific Societies
in the Netherlands (www.federa.org/codes-conduct). Thirty-two
patients presented with distant metastasis at diagnosis or developed
distant metastasis (including supraclavicular lymph node metastasis)
within 1 month following primary surgery (M1 patients). These 32
patients  and  the  253  patients  who  developed  a  first  recurrence  during 
follow-up  (25  patients  with  local-regional  relapse  and  228  patients 
with distant metastasis) were treated with first-line tamoxifen. All patients
were ER+ and anti-hormonal therapy-naive, but 38 patients
received  adjuvant  chemotherapy.  The  median  time  between  the 
primary  surgery  and  the  start  of  therapy  was  24  months  (range,  0 
to  120  months).  The  median  follow-up  of  patients  alive  at  the  end  of 
follow-up  was  98  months  (range,  9  to  240  months)  after  the  primary 
surgery  and  45  months  (range,  3  to  178  months)  after  the  start  of 
first-line tamoxifen therapy. For 182 patients (64%), disease progression
occurred within 6 months of the start of the first-line therapy
being  controlled  by  tamoxifen.  At  the  end  of  the  follow-up  period, 
268  (94%)  patients  had  developed  tumor  progression,  and  222 
(78%)  patients  had  died. 

Total  RNA  was  extracted  with  RNA  Bee  (Tel  Test,  Thermo  Fisher 
Scientific  Inc.)  from  30-µm  fresh-frozen  primary  tumor  tissue  sections
containing  at  least  30%  invasive  tumor  cell  nuclei,  and  mRNA  transcripts
were  quantified  by  RT-qPCR  as  described  previously  by  Sieuwerts
et  al.  (23).  The  median  A3B  expression  level  in  the  group  of  285  breast 
cancers  was  0.22  relative  to  the  normalized  average  of  three  reference 
genes  [HPRT1,  HMBS,  and  TBP  (23)]. 
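
As an illustration of this kind of normalization, the sketch below expresses a target transcript relative to the mean quantification cycle (Cq) of three reference genes. It assumes a standard 2^(-ΔCq) calculation with perfect doubling per cycle, which may differ from the exact procedure in the cited protocol; the function name and all Cq values are hypothetical.

```python
from statistics import mean


def relative_expression(target_cq, reference_cqs):
    """Return the target level relative to the mean reference-gene Cq.

    Assumption: 2^-(delta Cq) with 100% amplification efficiency.
    """
    if not reference_cqs:
        raise ValueError("at least one reference Cq is required")
    delta_cq = target_cq - mean(reference_cqs)
    return 2.0 ** (-delta_cq)


# Hypothetical Cq values for A3B and for HPRT1, HMBS, and TBP.
level = relative_expression(27.0, [25.0, 24.0, 26.0])
print(level)
```

A target that crosses threshold two cycles later than the reference average is reported at one quarter of the reference level under this assumption.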

DNA  constructs 

A3B  knockdown  and  control  shRNA  constructs  were  described  and 
validated  previously  by  Burns  et  al.  (7)  and  Leonard  et  al.  (50).  The 
A3B  and  A3B-E255Q  lentiviral  expression  constructs  were  based 
on  the  pLenti4TO  backbone  (Life  Technologies).  Overlapping  PCR
was  used  to  place  a  sense-encoded  intron  between  an  antisense- 
encoded  A3B  open  reading  frame  (primers  available  on  request).  A 
cytomegalovirus  promoter  drove  A3B  expression,  and  a  simian  virus 
40  early  promoter  drove  puromycin  resistance.  Constructs  were  verified
by  DNA  sequencing.

Cell  culture  studies 

MCF-7L  cells  were  cultured  at  37°C  under  5%  CO2  and  maintained
in  improved  minimum  essential  medium  (Richter’s  modification 
medium)  containing  5%  fetal  bovine  serum,  penicillin  (100  U/ml), 
streptomycin  (100  µg/ml),  and  11.25  nM  recombinant  human  insulin.
These  cells  were  originally  obtained  from  C.  Kent  Osborne
(Baylor  College  of  Medicine,  Houston,  TX)  and  are  subject  to  short 
tandem  repeat  analysis  yearly  to  confirm  their  identity  with  the 
original  MCF-7  cell  line.  Cells  were  transduced  with  the  lentivirus- 
based  shRNA  or  conditional  expression  constructs  described  above 
and  selected  with  puromycin  (1  µg/ml;  United  States  Biological)  for
72  hours  to  generate  uniformly  transduced  pools.  Cell  growth  experiments
were  performed  by  plating  100,000  cells  per  six-well  plate
and  incubating  them  at  37°C  for  the  indicated  days.  Cells  were  trypsinized,
diluted  1:2  in  trypan  blue  (Invitrogen),  and  counted  via  a  hemocytometer
(six  biological  replicates  per  day  per  condition).  Cell
proliferation  rates  were  determined  using  the  xCELLigence  real-time 
cell  analyzer  dual-plate  instrument  according  to  the  manufacturer’s 
instructions  (ACEA  Biosciences). 
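
The hemocytometer counts described above can be converted to a concentration as in the following sketch. It assumes a standard chamber in which each large square holds 0.1 µl (hence the factor of 10^4) and reads the 1:2 dilution as a twofold dilution factor; the function name and the counts are hypothetical.

```python
def viable_cells_per_ml(counts_per_square, dilution_factor=2.0):
    """Estimate viable cells/ml from unstained-cell counts per large square.

    Assumptions: each large square holds 0.1 µl (factor 1e4), and a 1:2
    dilution in trypan blue corresponds to a dilution factor of 2.
    """
    if not counts_per_square:
        raise ValueError("at least one square must be counted")
    mean_count = sum(counts_per_square) / len(counts_per_square)
    return mean_count * dilution_factor * 1e4


# Hypothetical counts of trypan blue-excluding cells in four squares.
conc = viable_cells_per_ml([48, 52, 50, 50])
print(conc)
```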

The  mRNA  level  of  each  APOBEC  family  member  gene  was  quantified
using  previously  described  RT-qPCR  protocols  and  primer/probe
combinations  and  presented  relative  to  the  housekeeping  gene
TBP  (7,  51,  52).  ESR1  and  C-MYC  RNA  were  quantified  by  RT-qPCR
using  intron-spanning  primers  5'-ATGACCATGACCCTCCACACC
and  5'-TCAGACCGTGGCAGGGAAACC  (UPL24)  and
5'-GCTGCTTAGACGCTGGATTT  and  5'-TAACGTTGAGGGGCATCG
(UPL66),  respectively,  and  manufacturer-recommended
protocols  (LightCycler  480,  Roche).  C-MYC  is  an  established  estrogen- 
responsive  gene  (53). 

DNA  deaminase  activity  was  measured  in  soluble  whole-cell, 
nuclear,  and  cytoplasmic  fractions  of  MCF-7L  cultures  using 
established  protocols  (7,  54).  The  single-stranded  DNA  substrate
contained  a  single  target  cytosine
(5'-ATTATTATTATTCGAATGGATTTATTTATTTATTTATTTATTT-fluorescein);
deamination,  uracil  excision,  and  backbone  cleavage  resulted  in  a
single  faster-migrating  product  on  SDS-polyacrylamide  gel  electrophoresis
and  image  analysis  (Typhoon  FLA  7000  and  ImageQuant
software,  GE  Healthcare  Life  Sciences). 

Xenograft  studies 

The  University  of  Minnesota  Institutional  Animal  Care  and  Use 
Committee  approved  the  animal  protocols  used  here  (1305-30638A).
MCF-7L  cells  were  harvested  at  70%  confluence,  counted,
and  resuspended  in  serum-free  medium  (without  phenol  red)  at  a 
concentration  of  5  million  cells  per  50  µl  of  final  volume.  Ovariectomized,
athymic  mice  (Harlan)  were  injected  subcutaneously  in
the  left  flank  with  50  µl  of  cell  suspension  at  approximately  5  weeks
of  age.  Each  experiment  was  initiated  with  5  or  10  mice  per  experimental
condition.  One  week  before  injection  and  at  all  times  following,
the  mice  were  provided  with  drinking  water  supplemented
with  1  µM  β-estradiol  (Sigma-Aldrich)  (except  for  the  subset  of
mice  used  in  the  experiment  shown  in  fig.  S3).  Tumors  were  measured
bidirectionally  twice  weekly,  and  tamoxifen  treatment  began  when  the
average  tumor  volume  reached  150  mm³.  Tamoxifen  citrate  (500  µg;
Sigma-Aldrich)  emulsified  in  50  µl  of  peanut  oil  was  administered
subcutaneously  5  of  7  days  each  week.  Tumor  volumes  were  calculated
using  the  following  formula:  length  ×  breadth²/2.
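
The volume formula and the 150 mm³ treatment threshold can be sketched as follows. The function names and the caliper measurements are hypothetical and serve only to show the arithmetic.

```python
TREATMENT_THRESHOLD_MM3 = 150.0  # average volume at which tamoxifen began


def tumor_volume_mm3(length_mm, breadth_mm):
    """Volume from two caliper measurements: length x breadth^2 / 2."""
    return length_mm * breadth_mm ** 2 / 2.0


def mean_volume_mm3(measurements):
    """Average volume over (length, breadth) pairs for one group of mice."""
    volumes = [tumor_volume_mm3(l, b) for l, b in measurements]
    return sum(volumes) / len(volumes)


# Hypothetical (length, breadth) measurements in mm for three mice.
group = [(10.0, 6.0), (8.0, 5.0), (9.0, 6.0)]
average = mean_volume_mm3(group)
start_treatment = average >= TREATMENT_THRESHOLD_MM3
print(average, start_treatment)
```

In this made-up group the individual volumes are 180, 100, and 162 mm³, so the average is just under the threshold and treatment would not yet begin.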

MCF-7L  exome  sequencing 

Genomic  DNA  was  prepared  from  tumor  cell  masses  (~20  mg  per  sample)
via  the  Gentra  Puregene  Tissue  DNA  isolation  protocols  (Qiagen).
Samples  were  diluted  to  100  ng/µl  and  assessed  further  for  quality  and
purity  by  SYBR  Green  PCR  on  a  197-bp  fragment  of  A3H  using  primers
5'-CATGGGACTGGACGAAGCGCA  and  5'-TGGGATCCACACAGAAGCCGCA.
Samples  with  no  amplification  were  excluded  from
the  analysis.  One  microgram  of  total  genomic  DNA  per  sample  was
subjected  to  whole-exome  sequencing  on  the  Complete  Genomics
platform  to  an  average  target  depth  of  100×  (BGI).  Reads  were
aligned  by  BGI  using  its  in-house  pipeline,  and  the  alignments  in  BAM
format  were  used  for  variant  calling.  Somatic  variants  were  called  for
each  tumor  alignment  by  VarScan  2  (55)  using  an  estrogen-treated
shA3B  sample  as  the  normal  control.  The  variants  were  filtered  with  a
minimum  overall  coverage  depth  of  20  reads  and  a  minimum  coverage 
depth  of  4  reads  for  the  alternate  allele.  Any  variant  occurring  at  any 
frequency  above  0  at  the  same  position  in  more  than  one  sample  was 
considered  a  common  mutation  in  the  input  pool  and  was  removed. 
A  full  list  of  base  substitution  mutations  is  provided  in  table  S3. 
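
The filtering rules above can be sketched as follows. This is not the pipeline used in the study: the record layout (dictionaries with `sample`, `chrom`, `pos`, `depth`, and `alt_depth` keys) is hypothetical, and the sketch assumes that shared positions are identified from all calls with any alternate-allele support before the depth thresholds are applied.

```python
from collections import defaultdict

MIN_DEPTH = 20      # minimum overall coverage depth (reads)
MIN_ALT_DEPTH = 4   # minimum reads supporting the alternate allele


def filter_variants(calls):
    """Drop positions seen in more than one sample, then apply depth cutoffs."""
    samples_by_site = defaultdict(set)
    for call in calls:
        if call["alt_depth"] > 0:  # any frequency above 0 counts
            samples_by_site[(call["chrom"], call["pos"])].add(call["sample"])

    kept = []
    for call in calls:
        site = (call["chrom"], call["pos"])
        if len(samples_by_site[site]) > 1:
            continue  # treated as a common mutation in the input pool
        if call["depth"] < MIN_DEPTH or call["alt_depth"] < MIN_ALT_DEPTH:
            continue
        kept.append(call)
    return kept


# Hypothetical calls from two tumors.
calls = [
    {"sample": "T1", "chrom": "chr1", "pos": 100, "depth": 50, "alt_depth": 10},
    {"sample": "T1", "chrom": "chr1", "pos": 200, "depth": 40, "alt_depth": 8},
    {"sample": "T2", "chrom": "chr1", "pos": 200, "depth": 30, "alt_depth": 1},
    {"sample": "T2", "chrom": "chr2", "pos": 300, "depth": 15, "alt_depth": 6},
    {"sample": "T2", "chrom": "chr3", "pos": 400, "depth": 60, "alt_depth": 3},
]
kept = filter_variants(calls)
print([(v["sample"], v["chrom"], v["pos"]) for v in kept])
```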

Statistics 

Comparisons  of  the  PFS  of  hormone-naive  breast  cancer  patients 
following  treatment  for  first  recurrence  with  tamoxifen,  by  A3B  expression
level  (divided  into  quartiles),  were  conducted  using  log-rank
tests;  HRs  and  95%  confidence  intervals  are  presented  for
pairwise  comparisons.  Clinical  data  were  analyzed  using  SPSS  Statistics
version  23.0  (IBM).  In  the  xenograft  studies,  repeated  measures
of  tumor  volume  over  time  were  compared  by  treatment
group  using  linear  mixed  models  with  fixed  effects  for  treatment, 
days,  and  interaction  between  treatment  and  days  and  with  random 
intercept  and  slope  effects  for  each  mouse.  P  values  <0.05  were 
considered  statistically  significant.  Xenograft  data  were  analyzed 
using  Prism  6  and  SAS  9.3. 
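
The quartile grouping that precedes the log-rank comparisons can be sketched as below. The sketch covers only the assignment of patients to expression quartiles, not the survival analysis itself, which was run in SPSS; the function name, the expression values, and the choice of the inclusive quantile method are assumptions.

```python
from statistics import quantiles


def assign_quartiles(values):
    """Return a quartile label (1 to 4) for each expression value."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")

    def label(value):
        if value <= q1:
            return 1
        if value <= q2:
            return 2
        if value <= q3:
            return 3
        return 4

    return [label(v) for v in values]


# Hypothetical relative A3B expression levels for eight patients.
levels = [0.05, 0.10, 0.15, 0.20, 0.25, 0.40, 0.80, 1.60]
groups = assign_quartiles(levels)
print(groups)
```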


SUPPLEMENTARY  MATERIALS 

Supplementary  material  for  this  article  is  available  at  http://advances.sciencemag.org/cgi/content/full/2/10/e1601737/DC1

fig.  S1.  APOBEC  family  member  expression  in  MCF-7L  cells.

fig.  S2.  Replica  A3B  knockdown  xenograft  experiment. 

fig.  S3.  Estrogen  does  not  affect  A3B  mRNA  levels  in  engrafted  MCF-7L  cells. 

fig.  S4.  A3B  is  not  estrogen-inducible. 

fig.  S5.  Replica  A3B  overexpression  xenograft  experiment. 

fig.  S6.  ESR1  mRNA  levels  in  tamoxifen-resistant  MCF-7L  cells. 

table  S1.  Patient  characteristics  and  median  and  interquartile  range  of  APOBEC3B  mRNA  levels.

table  S2.  Cox  univariate  and  multivariate  analyses  for  PFS  after  initiating  first-line  tamoxifen.

table  S3.  Single-base  substitution  mutations  in  tamoxifen-resistant  tumors  (separate  Microsoft
Excel  file).